DNA Haplogroups are both awesome and ridiculous

Tim Piatenko
6 min read · May 9, 2021

When I get into something, I tend to REALLY get into it. I don’t stop till I’ve learned as much as I can, or something new and more exciting comes along :) For the past year or so, I’ve been neck deep in human DNA ancestry. It’s a fascinating and quickly developing topic, fueled by advances in genetics, statistics, and computing power. We can extract and analyze the entire genome of any organism, and run statistical analyses on large samples. It’s a very rich field, and as such, it’s pretty damn confusing at times.

I’m writing this as much for myself, really, as for anyone else out there who is interested in this topic. In my last article, I mentioned MyTrueAncestry and the ability to use their service to perform DNA matching against historical / archeological samples. One of the tools they provide is Haplogroup matching. Now, haplogroups themselves are not that complicated in theory — it’s a specific genetic mutation that gets passed down from one generation to the next without changes, or at least with “nested” changes that result in subgroups of the original without crossing into other groups’ territories.

Map of human migration with maternal haplogroup overlays

Haplogroups are a great way to trace DIRECT lineages back thousands of years. They are very useful for mapping human migrations using nested mutations in our genetic code. They are also being used by many to identify with their heritage… especially aristocratic and noble families that treasure their pedigree. And this is where I have a problem with the math… because to really appreciate what a haplogroup means, you need to put it in PERSPECTIVE against the entire genome. It’s literally ONE line of descent in your giant family tree.

So just how giant is our ancestral pool? Let’s begin with some basics.

DNA is made up of 4 types of Nucleotides — the “bars” between the “bands” in the DNA double helix structure.

Basic DNA structure

These make up Genes, which make up Chromosomes, which come in pairs (23 for humans) and live in cell nuclei.

The pairs are inherited from your parents, and most of that inheritance is random through the process of Recombination. That’s why we inherit a mix of traits and don’t look like clones of our siblings. Or why genes may “skip a generation” and make you look like your (great) grandparents.

So we have 23 pairs of these chromosome things inside our nuclei. 22 of them are the same in males and females. The last pair is the sex-determining one — ladies have two X’s, while guys have an X and a Y (which is a midget…)

The fun part is that the Y-chromosome for the most part does NOT recombine! About 95% of it gets passed down with almost no change from father to son. And that is where haplogroups come from. Apart from downstream mutations that create further branches within the main haplogroup branch, that part of the DNA is unchanged.

Unfortunately for the ladies, they do not inherit the Y… However, there’s an alternative that lives outside of the nucleus in something called Mitochondria (which may very well be an ancient symbiotic bacterium that got incorporated into our own bodies…)

Mitochondria do not participate in the usual reproduction process and also avoid the recombination mess. Thus, like the Y, the so-called mtDNA is also traceable back in time for thousands of generations.

The rest of our DNA is Autosomal and inherited from ALL of our ancestors.

So how much of each type is there?? Well, each chromosome is different, but they contain hundreds to thousands of genes, resulting in a total of about 20–25K. After poking around, I found this info:

  • The nuclear genome comprises approximately 3 200 000 000 nucleotides of DNA, divided into 24 linear molecules, the shortest 50 000 000 nucleotides in length and the longest 260 000 000 nucleotides, each contained in a different chromosome.
  • The mitochondrial genome is a circular DNA molecule of 16 569 nucleotides, multiple copies of which are located in the energy-generating organelles called mitochondria.

So… 3 billion nucleotides in nuclear DNA! And your haplogroup is determined by… one? Let’s take a look. Haplotypes are determined by SNPs (single-nucleotide polymorphisms), which are literally a single flip of a nucleotide! A haplotype can be determined by a set of SNPs, but then they are linked, so in the end only a handful are needed to determine a unique group.
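To make the “nested” idea concrete, here is a minimal sketch of how a handful of SNPs can pin down a branch. Everything in it is made up for illustration: the tree, the branch names, and the marker names are hypothetical, and real haplogroup callers also have to deal with missing reads and back-mutations, which this ignores.

```python
# Toy haplogroup tree. Every branch and marker name here is hypothetical.
# Each branch is defined by ONE marker and sits inside its parent branch.
TOY_TREE = {
    "ROOT": {"marker": None, "children": ["A", "B"]},
    "A": {"marker": "snp_a", "children": ["A1", "A2"]},
    "A1": {"marker": "snp_a1", "children": []},
    "A2": {"marker": "snp_a2", "children": ["A2x"]},
    "A2x": {"marker": "snp_a2x", "children": []},
    "B": {"marker": "snp_b", "children": []},
}


def assign_haplogroup(derived_snps, tree=TOY_TREE, node="ROOT"):
    """Walk down from the root while a child's defining marker is present.

    Returns the deepest branch reached. Branches never cross: once you
    are inside "A", only sub-branches of "A" are considered.
    """
    for child in tree[node]["children"]:
        if tree[child]["marker"] in derived_snps:
            return assign_haplogroup(derived_snps, tree, child)
    return node


# Three markers out of however many were tested are enough to land on A2x.
sample = {"snp_a", "snp_a2", "snp_a2x", "unrelated_marker"}
print(assign_haplogroup(sample))
```

The point of the sketch is just that the answer depends on a few markers along one path, no matter how many other SNPs are in the sample.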

Let’s compare that to the autosomal SNPs. 23andMe genotypes only a tiny portion of the genome: 630,132 autosomal SNPs, 4,318 mitochondrial, 16,530 X, and 3,733 Y. MyHeritage is similar, except 2x X and no mtDNA at all.

So by SNP count, your paternal (Y) DNA is about half a percent of your autosomal total, and the maternal side is less than 3 percent (under 1 percent if you count mtDNA alone). And of that, only a few determine your “recent” haplogroup. My maternal T2a1b1a1, for example, is between 5.7 and 11.1 thousand years old! That’s the Mesolithic period… So someone with my haplogroup may have had no other relations in common for 10K years :)
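For anyone who wants to check the arithmetic, here is a quick sketch using only the 23andMe SNP counts quoted above. The dictionary and function names are mine, nothing official.

```python
# SNP counts for the 23andMe chip, as quoted in the text above.
snp_counts = {"autosomal": 630132, "mtDNA": 4318, "X": 16530, "Y": 3733}


def percent_of_autosomal(kind):
    """Size of one SNP set relative to the autosomal set, in percent."""
    return 100 * snp_counts[kind] / snp_counts["autosomal"]


for kind in ("Y", "mtDNA", "X"):
    print(f"{kind}: {percent_of_autosomal(kind):.2f}% of the autosomal SNP count")
```

That works out to roughly 0.59% for Y, 0.69% for mtDNA, and 2.62% for X.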

DNA inherited from past generations

So when someone is claiming genetic affinity to a certain group or ethnicity based on haplotypes… I don’t get it. In fact, I don’t get the whole “royal line” argument at all. How can it make sense, when it’s such a tiny portion of your lineage?? And given the ages of these haplotypes, direct ancestry is inapplicable, unless you take one of the “full” DNA sequencing tests.
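Here is a rough sketch of just how tiny that portion is. It assumes the simplest possible family tree, where the number of ancestor slots doubles every generation. In reality the same people fill many slots the further back you go, so these are slots in the tree, not distinct individuals. The function names are made up for this example.

```python
def ancestor_slots(generations):
    """Ancestor positions in the family tree this many generations back."""
    return 2 ** generations


def direct_line_share(generations):
    """Share of those positions covered by ONE direct line.

    The father's-father's-father line (Y) fills exactly one slot per
    generation, and so does the mother's-mother's-mother line (mtDNA).
    """
    return 1 / ancestor_slots(generations)


for g in (1, 5, 10, 20, 40):
    print(f"{g:>2} generations back: {ancestor_slots(g):,} slots, "
          f"one direct line = {direct_line_share(g):.1e} of them")
```

Ten generations back, a single direct line is already just 1 slot out of 1,024.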

I got curious about this topic when 23andMe told me my N1c1a (N-M178) haplogroup includes Rurik, the original ruler of the Rus that evolved into the Russia of today. I discovered that there’s a large community trying to validate their noble lineages via haplogroup affinity. But most of these go no further than this. So all of them (us?) have a common male ancestor 10K years ago… Great. Rurik lived in the 9th century AD, only about 1,100 years back. So this proves what exactly? 🙄
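To see the size of that gap in generations, here is a back-of-the-envelope sketch. The generation lengths of 25 and 30 years are my assumption, not a measured value, and the figure for Rurik is just “9th century AD” rounded.

```python
def generations_back(years, years_per_generation):
    """Rough number of generations that fit into a span of years."""
    return years / years_per_generation


HAPLOGROUP_ANCESTOR_YEARS = 10_000  # the "10K years ago" common ancestor
RURIK_YEARS = 1_100                 # 9th century AD, rounded (assumption)

for span in (25, 30):  # assumed years per generation
    shared = generations_back(HAPLOGROUP_ANCESTOR_YEARS, span)
    rurik = generations_back(RURIK_YEARS, span)
    print(f"{span} yrs/generation: common ancestor ~{shared:.0f} generations back, "
          f"Rurik ~{rurik:.0f}, leaving ~{shared - rurik:.0f} generations unaccounted for")
```

Either way, sharing the haplogroup leaves a few hundred generations in which the lines could have split long before Rurik.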

MyTrueAncestry Haplogroup graph for my 23andMe data

I guess my point is that while it’s a cute factoid, I will not get excited about my royal origins until they find and fully sequence some of the original Rus princes. Then maybe I can pony up and take the Big-Y test. Until then, I’ll enjoy the fact that I do share 5 actual autosomal SNP chains of 60 to 400 SNPs each with Izjaslav Ingvarevych, a Rurikid prince who died in battle on the eve of the Mongol invasion in the 13th century. Even if he’s not in my haplogroup 🤪

Tim Piatenko

I’m a Caltech particle physics PhD turned Data Scientist. Russia → Japan → US. Also on Mastodon @timoha@mastodon.world / @timoha@newsie.social 🐘