Unique Ancestors

What is a Unique Ancestor?

A unique ancestor is simply a person who is an ancestor, notwithstanding that they may have contributed to our descent in more than one direct line.

How few or how many Unique Ancestors do we actually have?

I present herewith a hypothesis for a distribution curve that may give us a clue to quantifying the counts of unique ancestors. There is a complex relationship between the total number of ancestors we have in a binary tree sense and the number who are unique.


For some time now I have been interested in the mathematics of ancestry. My interest was triggered by the early April 1997 discussion thread in soc.genealogy.medieval on the counts of ancestors to a particular generation, and the expectation of the number of unique ones that count included. Subsequently, I have found that other people share this interest and have followed lines of analysis similar to mine.

There appeared to be a strange divergence in examples previously given between the growth rates of total and unique ancestors that warranted further examination. I therefore wrote a small computer program of my own to assist in the study of these effects. For a specific individual in a database, the program counts the recorded ancestral links by the unique people involved. I have used this information to extrapolate an empirical curve for unique ancestors. Another programmer has also written a program to perform these same calculations, reference to which is shown below.

Theory and Example Data

I emphasise that my "research" is basically empirical, not theoretical. I am also not an expert in mathematics, merely an interested observer and applicator. The phenomenon I describe must have previously been noted, and perhaps there are other explanations of which I am unaware.

For the purpose of these notes, I have defined a few special terms used in the table, as follows.

  • Total Ancestors. The total number of ancestral links, given by 2^N - 2, where N is the number of generations, counting the starting individual as generation one (as per the Ahnentafel generation counting method).
  • Recorded Knowns. The count of actual links from the starting individual in the given database.
  • Recorded Uniques. The number of distinct people in the database who account for the Recorded Knowns.
  • Estimated Uniques. An extrapolation based on the known data.
  • Highest # Descents. The count from the database for the individual ancestor with the greatest number of descent paths or lines to the selected descendant.
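The terms above can be illustrated with a minimal Python sketch. It is not the author's program: the `parents` mapping, the function names and the toy pedigree (a child of first cousins) are hypothetical, chosen only to show how link counts and unique counts diverge once cousins marry.

```python
from collections import Counter

def total_ancestors(n):
    """Ancestral links up to generation n, with the subject as generation 1."""
    return 2 ** n - 2

def count_ancestors(parents, start, max_gen):
    """Return (recorded_knowns, recorded_uniques, highest_descents).

    parents maps a person to a (father, mother) tuple; either entry may be None.
    """
    paths = Counter()              # ancestor -> number of descent lines to start
    current = Counter({start: 1})  # people in the current generation, with multiplicity
    for _ in range(2, max_gen + 1):
        nxt = Counter()
        for person, lines in current.items():
            for parent in parents.get(person, (None, None)):
                if parent is not None:
                    nxt[parent] += lines
        paths.update(nxt)          # Counter.update adds the counts
        current = nxt
    knowns = sum(paths.values())
    uniques = len(paths)
    highest = max(paths.values(), default=0)
    return knowns, uniques, highest

# Hypothetical pedigree: gf1 and gm2 are siblings, so dad and mum are first cousins.
parents = {
    "child": ("dad", "mum"),
    "dad": ("gf1", "gm1"),
    "mum": ("gf2", "gm2"),
    "gf1": ("ggf", "ggm"),
    "gm2": ("ggf", "ggm"),
}

print(total_ancestors(4))                     # 14 possible links
print(count_ancestors(parents, "child", 4))   # (10, 8, 2)
```

In this toy case 10 of the 14 possible links are recorded, but they involve only 8 distinct people, and the shared great-grandparents each have 2 descent lines to the child.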

The following details for Prince William of Wales, taken from my own database, cover the range to 80 generations. They include the data of some legendary or mythical people, but still serve to illustrate the effects involved. Larger figures are shown in scientific notation for brevity. Whether or not we know exactly who our ancestors were by name, we can still be sure that people existed in those ancestral roles!

To Generation | Total Ancestors | Recorded Knowns | Recorded Uniques | Estimated Uniques | Highest # Descents
1 0 0 0 0 0
2 2 2 2 2 1
3 6 6 6 6 1
4 14 14 14 14 1
5 30 30 30 30 1
6 62 62 62 62 1
7 126 126 122 122 2
8 254 253 236 237 2
9 510 482 422 445 4
10 1,022 885 723 827 6
11 2,046 1,589 1,191 1,508 12
12 4,094 2,792 1,861 2,649 17
13 8,190 4,978 2,858 4,517 28
14 16,382 8,973 4,301 7,476 45
15 32,766 16,369 6,356 11,962 77
16 65,534 29,943 8,917 18,216 193
17 131,070 55,278 12,191 26,685 354
18 262,142 102,698 16,062 37,385 694
19 524,286 191,703 20,305 49,882 1,286
20 1.0E+06 359,851 24,649 63,427 2,600
21 2.1E+06 676,348 28,708 76,874 5,906
22 4.2E+06 1.3E+06 32,668 90,904 8,384
23 8.4E+06 2.4E+06 36,414 105,207 14,106
24 1.7E+07 4.4E+06 39,759 119,050 26,759
25 3.4E+07 8.1E+06 42,568 131,761 42,097
26 6.7E+07 1.5E+07 45,001 143,975 80,021
27 1.3E+08 2.7E+07 47,038 155,492 155,445
28 2.7E+08 4.8E+07 48,706 166,222 245,424
29 5.4E+08 8.4E+07 49,993 175,696 370,665
30 1.1E+09 1.5E+08 50,915 183,510 874,387
31 2.1E+09 2.6E+08 51,615 190,425 1.6E+06
32 4.3E+09 4.4E+08 52,139 196,590 2.2E+06
33 8.6E+09 7.4E+08 52,497 201,734 3.3E+06
34 1.7E+10 1.2E+09 52,757 206,397 5.2E+06
35 3.4E+10 2.0E+09 52,941 210,607 8.8E+06
36 6.9E+10 3.1E+09 53,076 214,645 1.6E+07
37 1.4E+11 4.8E+09 53,188 219,141 2.6E+07
38 2.7E+11 7.3E+09 53,282 224,329 4.3E+07
39 5.5E+11 1.1E+10 53,370 231,146 7.4E+07
40 1.1E+12 1.6E+10 53,451 240,135 1.1E+08
41 2.2E+12 2.3E+10 53,522 251,694 1.8E+08
42 4.4E+12 3.2E+10 53,595 269,637 3.5E+08
43 8.8E+12 4.3E+10 53,660 294,491 5.7E+08
44 1.8E+13 5.7E+10 53,715 328,081 8.0E+08
45 3.5E+13 7.5E+10 53,754 366,894 1.0E+09
46 7.0E+13 9.7E+10 53,788 422,685 1.7E+09
47 1.4E+14 1.2E+11 53,823 517,611 2.3E+09
48 2.8E+14 1.5E+11 53,852 647,288 2.8E+09
49 5.6E+14 1.9E+11 53,878 839,235 3.2E+09
50 1.1E+15 2.4E+11 53,900 1,110,154 4.0E+09
51 2.3E+15 2.9E+11 53,916 see notes below 5.1E+09
52 4.5E+15 3.5E+11 53,927 6.3E+09
53 9.0E+15 4.3E+11 53,937 7.2E+09
54 1.8E+16 5.0E+11 53,946 7.7E+09
55 3.6E+16 5.9E+11 53,951 8.5E+09
56 7.2E+16 6.7E+11 53,959 9.2E+09
57 1.4E+17 7.5E+11 53,967 1.2E+10
58 2.9E+17 8.2E+11 53,972 1.3E+10
59 5.8E+17 8.8E+11 53,978 1.6E+10
60 1.2E+18 9.2E+11 53,985 1.8E+10
61 2.3E+18 9.6E+11 53,991 1.9E+10
62 4.6E+18 1.0E+12 53,999 1.9E+10
63 9.2E+18 1.0E+12 54,007 2.0E+10
64 1.8E+19 1.1E+12 54,015 2.0E+10
65 3.7E+19 1.1E+12 54,024 2.1E+10
66 7.4E+19 1.1E+12 54,034 2.1E+10
67 1.5E+20 1.1E+12 54,043 2.1E+10
68 3.0E+20 1.1E+12 54,051 2.2E+10
69 5.9E+20 1.2E+12 54,056 2.2E+10
70 1.2E+21 1.2E+12 54,059 2.2E+10
71 2.4E+21 1.2E+12 54,063 2.2E+10
72 4.7E+21 1.2E+12 54,068 2.2E+10
73 9.4E+21 1.2E+12 54,073 2.2E+10
74 1.9E+22 1.3E+12 54,077 2.2E+10
75 3.8E+22 1.3E+12 54,081 2.2E+10
76 7.6E+22 1.3E+12 54,085 2.2E+10
77 1.5E+23 1.4E+12 54,089 2.3E+10
78 3.0E+23 1.4E+12 54,091 2.3E+10
79 6.0E+23 1.4E+12 54,094 2.3E+10
80 1.2E+24 1.5E+12 54,096 2.3E+10

This high level of detail is shown to emphasise the magnitude of the numbers involved. At 80 generations (about 2000 years) we are all descended from 1.2E+24 total ancestors. This is a VERY BIG number. We all know, for instance, that the distance travelled by light in a year is large, but to put it into perspective, the number of millimetres (a small but practical unit) in a light year is only about 9.5E+18. Compare this to 1.2E+24. Wow!
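The comparison can be checked with a few lines of Python. This is a sketch only: the speed of light (299,792,458 metres per second) and a Julian year of 365.25 days are standard values that are not given in the text above.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458        # exact by definition
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60    # Julian year

light_year_m = SPEED_OF_LIGHT_M_PER_S * SECONDS_PER_YEAR
light_year_mm = light_year_m * 1000
total_ancestors_80 = 2 ** 80 - 2            # the 2^N - 2 formula with N = 80

print(f"metres in a light year:           {light_year_m:.1E}")
print(f"millimetres in a light year:      {light_year_mm:.1E}")
print(f"total ancestors at generation 80: {total_ancestors_80:.1E}")
```

Even measured in millimetres, a light year falls several orders of magnitude short of the 80 generation total.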

Since 1997 I have been researching and recording the ancestors of Prince William with kind assistance from a number of other people. In this time his ancestors in my database have grown from 13,000 to about 54,100 currently. In 1998 at the 40 generation point my records showed 21,400 out of a database total of 22,200. At that time my 40 generation projection of Estimated Uniques was 185,220. Compare that to now, where with 53,451 ancestors at that same point the projection is 240,135. Considering the huge change in the number of people recorded, the projection is still quite similar.

The data we have is in effect a statistical sample, albeit a biased one. The greater the sample size, the higher the reliability of the information. In the above case, the percentages of Recorded Knowns to Total Ancestors at each 5 generation step are 100%, 78%, 44%, 32%, 22%, 12%, 4%, 1% and really tiny after 40 generations. The most recent generations therefore provide more reliable and complete data. For this reason I have only shown Estimated Uniques in the above table to 50 generations. Even at that point I believe the number is well overstated due to the sample size effect.
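The coverage ratio can be recomputed from any version of the table. The sketch below uses the Total Ancestors and Recorded Knowns values printed in the table above at every fifth generation. The larger values there are rounded to two significant figures, and it is only an assumption on my part that the percentages quoted in the text may reflect an earlier state of the database, so the results need not match them exactly.

```python
# (generation, total ancestors, recorded knowns), copied from the table above
rows = [
    (5, 30, 30),
    (10, 1_022, 885),
    (15, 32_766, 16_369),
    (20, 1.0e6, 359_851),
    (25, 3.4e7, 8.1e6),
    (30, 1.1e9, 1.5e8),
    (35, 3.4e10, 2.0e9),
    (40, 1.1e12, 1.6e10),
]

def coverage_percent(total, knowns):
    """Recorded Knowns as a percentage of Total Ancestors."""
    return 100.0 * knowns / total

for gen, total, knowns in rows:
    print(f"generation {gen:2d}: {coverage_percent(total, knowns):5.1f}% of links recorded")
```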

My database is not a particularly large one, but the growth in relationships is still significant. Furthermore, as the number of known people linked to the ancestral tree for the studied individual grows, the relationship counts grow at a larger but varying rate. These relationships result from the many complex interactions between descendants. Historically, the volume of data is unfortunately weak on the female side. When the data of later generations is examined, where full inclusion of female ancestry applies, a much richer set of complex links results. As more people are added to the "tree" and linked to existing ancestors, the Recorded Knowns grow significantly. This has happened over the last five years in my data.

Prince William's Ancestry

The information I have compiled so far on the ancestry of Prince William shows some interesting numbers. It is a work in progress and details change day by day.

  • Ancestors of Prince William recorded in database
  • Ancestors of Prince Charles recorded in database
  • Ancestors of Lady Diana recorded in database
  • Unique ancestors Prince Charles
  • Unique ancestors Lady Diana
  • Common ancestors Charles & Diana

Descent paths from William the Conqueror:

  • Descent lines: between 27 and 44 generations
  • Common Family Members: people ancestral to Prince William and descendants of William the Conqueror

Descent paths from Charlemagne:

  • Descent lines: between 33 and 64 generations
  • Common Family Members: people ancestral to Prince William and descendants of Charlemagne

Other details in my database:

  • Ancestors with parents recorded in database
  • Male ancestors with no parents recorded in database
  • Female ancestors with no parents recorded in database
  • Single fathers recorded in database
  • Single mothers recorded in database

Inbreeding Analysis

We often hear and see the comment that the royal families are inbred. This is generally a fallacy, deriving from an emotive impression of the effects of remote cousin marriages. I have analysed Prince William's ancestry using my program ATMatch, described below. It shows that in 12 generations of parental ancestry he has a Wright Coefficient of Inbreeding of 0.0052%, representing 33 common parental ancestors occupying 113 ancestral positions (out of 16,380). In addition, his Coefficient of Ancestry (a related measure showing where he has common ancestry regardless of which parent has the links) is 1.835%, representing 701 ancestral repetitions occupying 2,821 ancestral positions. Both coefficients are very small.
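For readers unfamiliar with the measure, Wright's coefficient of inbreeding for a person equals the kinship coefficient of that person's two parents. The following sketch computes it recursively on a toy pedigree. It is a textbook method, not the ATMatch routine. The `parents` mapping and the names in it are hypothetical, and people without recorded parents are assumed to be unrelated and not inbred.

```python
from functools import lru_cache

# Hypothetical pedigree: person -> (father, mother). "child" has first-cousin parents.
parents = {
    "child": ("dad", "mum"),
    "dad": ("gf1", "gm1"),
    "mum": ("gf2", "gm2"),
    "gf1": ("ggf", "ggm"),
    "gm2": ("ggf", "ggm"),
}

@lru_cache(maxsize=None)
def depth(person):
    """Longest recorded ancestral chain above a person (0 for a founder)."""
    known = [p for p in parents.get(person, ()) if p is not None]
    return 1 + max(depth(p) for p in known) if known else 0

@lru_cache(maxsize=None)
def kinship(a, b):
    """Probability that random alleles drawn from a and b are identical by descent."""
    if a is None or b is None:
        return 0.0
    if a == b:
        return 0.5 * (1.0 + inbreeding(a))
    if depth(a) < depth(b):
        a, b = b, a    # recurse through the person who cannot be an ancestor of the other
    father, mother = parents.get(a, (None, None))
    return 0.5 * (kinship(father, b) + kinship(mother, b))

def inbreeding(person):
    """Wright's F: the kinship coefficient of the person's two parents."""
    father, mother = parents.get(person, (None, None))
    return kinship(father, mother)

print(inbreeding("child"))   # 0.0625
```

The child of first cousins gets a coefficient of 1/16, or 6.25%, which gives some scale to the 0.0052% quoted above.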

Conclusion and Projection

My conclusion is that we probably all have fewer than 1,000,000 real unique ancestors within 80 generations.

This conclusion arises from my observation of the data available to me, in particular the first 40 generations. The following indicative curve shows my predicted pattern of incremental unique ancestors for succeeding generations. The vertical axis is the number of uniques per generation, and the horizontal the number of generations. The latter parts of the curve are logically derived, and may ultimately end with two people if taken to the limit.

There are mobility factors in the population that affect groups of people in different ways and alter the ancestral composition accordingly. The land owning classes tend to live and marry within small geographic regions, whereas itinerant workers move around the country so spreading their genes. This seems to be partially offset in the case of royals, as there is considerable mixture across wider social and political boundaries. It will take a lot more investigation before these patterns are clear.

The shape is similar to that of the skewed statistical F Distribution, but I do not profess to have a sensible explanation for this result. The peak appears to occur at around 25 to 30 generations, and moves a bit over the centuries. This may reflect the consequences of wars and famines.

For any individual, a hypothetical set of unique ancestors could accumulate as follows:

To Generation | Total Unique Ancestors

5 30
10 850
15 11,080
20 49,950
25 121,100
30 205,100
35 284,300
40 348,700
45 394,700
50 427,000
55 453,000
60 475,000
65 494,000
70 511,000
75 527,000
80 542,000
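The cumulative figures above can be turned back into increments, which is the quantity plotted on the curve described earlier. A small sketch, using only the numbers in the table (the function name `increments` is my own):

```python
# Cumulative unique ancestors at each fifth generation, copied from the table above
cumulative = {
    5: 30, 10: 850, 15: 11_080, 20: 49_950, 25: 121_100, 30: 205_100,
    35: 284_300, 40: 348_700, 45: 394_700, 50: 427_000, 55: 453_000,
    60: 475_000, 65: 494_000, 70: 511_000, 75: 527_000, 80: 542_000,
}

def increments(table):
    """New unique ancestors added in each five-generation band."""
    previous = 0
    result = {}
    for gen in sorted(table):
        result[gen] = table[gen] - previous
        previous = table[gen]
    return result

added = increments(cumulative)
peak_gen = max(added, key=added.get)
for gen, count in added.items():
    print(f"up to generation {gen:2d}: {count:7,d} new uniques")
print(f"largest band ends at generation {peak_gen}")
```

The largest increment falls in the band from generation 25 to 30 (84,000 new uniques), consistent with the peak described above.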

I hope that these notes provoke some further thought and contributions on this topic.

Extended Analysis

On request I have provided some further details from my study of Prince William's ancestry on a separate page.

Ftriplet Program

My simple FTRIPLET analysis program is freely available. It is a useful tool for reviewing an individual’s relationships within a database. It is menu driven, runs under DOS or the command prompt in Windows 95 and NT, and reads its data from a GEDCOM file.

Please use the program to examine the fascinating results of the relationships in your own databases, and be sure to let me know if you find any particularly complex or interesting ones. I would appreciate the data to help further research in the theory.

Its features include:

  • output lists of descent results for all individuals to a specified depth of generations, to give an indication of the complexity of the relationships between the people.
  • list all the actual ancestral lines between two people, listed in both ancestral and descendant sequences with optional full name listing. The sex of the parent is displayed, to assist in the review of genetic trail patterns.
  • list the descendants of a given person at a specified generation depth.
  • list the longest male, female and alternating gender descent paths between two people.
  • calculate the shortest descent line between two people in one simple step.
  • list the common ancestors for two people showing their cousin relationships with removal level.
  • list the people in the file without parents (orphans).
  • list a birth century analysis of the people.
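As an illustration of one of these features, the shortest descent line between two people can be found with a breadth-first search over parent links. This is a minimal sketch under my own assumptions, not FTRIPLET's code: the `parents` mapping, the function name and the toy pedigree are hypothetical.

```python
from collections import deque

def shortest_descent_line(parents, descendant, ancestor):
    """Return the shortest chain of people from descendant up to ancestor, or None.

    parents maps a person to a (father, mother) tuple; either entry may be None.
    """
    queue = deque([[descendant]])
    seen = {descendant}
    while queue:
        line = queue.popleft()
        person = line[-1]
        if person == ancestor:
            return line
        for parent in parents.get(person, (None, None)):
            if parent is not None and parent not in seen:
                seen.add(parent)
                queue.append(line + [parent])
    return None

# Hypothetical pedigree: ggf is dad's father and also mum's grandfather.
parents = {
    "child": ("dad", "mum"),
    "dad": ("ggf", "gm1"),
    "mum": ("gf2", "gm2"),
    "gf2": ("ggf", "ggm"),
}

print(shortest_descent_line(parents, "child", "ggf"))   # ['child', 'dad', 'ggf']
```

Because the search expands one generation at a time, the first line that reaches the ancestor is guaranteed to be a shortest one.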

Download the program from the software page.

ATMatch Program

This Windows program provides a means of comparing two separate ahnentafel lists for the same individual to identify the similarities and differences between them. I use it to compare the information in my database with that of another person.

Input is through special ahnentafel listings that can be separately derived from a GEDCOM file, or in a specific format from another researcher.

Program features include:

  • calculate the Wright Coefficient of Inbreeding (very fast routine) and my own Fettes Coefficient of Ancestry for the subject
  • utility routine to join ahnentafel (AT) numbers together, using either Kekule or British number formats
  • log the differences found between the files
  • optional tagging and various matching features
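Joining ahnentafel numbers relies on the standard Kekule rule that the father of person n is 2n and the mother is 2n + 1, with the subject as number 1. The sketch below shows the arithmetic only. The function names are hypothetical, it is not taken from ATMatch, and the British number format is not covered.

```python
def generation(at_number):
    """Generation of an ahnentafel number, counting the subject (number 1) as generation 1."""
    return at_number.bit_length()

def join_at(a, b):
    """Number in the subject's list of the person who is number b in the list of ancestor a."""
    if a < 1 or b < 1:
        raise ValueError("ahnentafel numbers start at 1")
    k = b.bit_length() - 1            # generations between ancestor a and person b
    return a * 2 ** k + (b - 2 ** k)

print(join_at(2, 3))    # 5: the father's mother
print(join_at(3, 4))    # 12: the mother's father's father
print(generation(12))   # 4
```

In binary terms, the join simply appends the digits of b after its leading 1 to the digits of a.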

Download the program from the software page.

Gedutil Program

In addition to my own FTRIPLET and ATMATCH analysis programs, I have made use of a utility program written by Torben B. Andersen. This program is command line driven and provides a number of very useful facilities for examination and manipulation of GEDCOM files. It is called GEDUTIL.

It runs significantly faster than my own program in the calculation of unique ancestors, though providing a different set of other features.

Among its features is the ability to:

  • find orphans, being people without parents and spouses
  • find people recorded in error as their own grandparents
  • find both male and female ancestors without parents
  • find islands of individuals who are not linked to the main group
  • search for names in data
  • prepare an ahnentafel list of ancestors to a nominated generation
  • prepare a summary table of unique ancestors
  • list both ancestors and descendants for an individual
  • find common ancestors of two nominated individuals
  • extract a portion of the data to a new GEDCOM file.
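To give a flavour of one such check, the sketch below finds people who are recorded, in error, as their own ancestor (which includes the own-grandparent case). It is my own illustration and makes no claim about how GEDUTIL works; the `parents` mapping and function names are hypothetical.

```python
def is_own_ancestor(parents, person):
    """True if following recorded parent links from person leads back to person."""
    stack = [p for p in parents.get(person, ()) if p is not None]
    seen = set()
    while stack:
        current = stack.pop()
        if current == person:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(p for p in parents.get(current, ()) if p is not None)
    return False

def people_in_loops(parents):
    """Everyone in the data whose recorded ancestry loops back to themselves."""
    return sorted(p for p in parents if is_own_ancestor(parents, p))

# Hypothetical faulty data: a -> b -> c -> a forms a loop; d merely descends from it.
faulty = {
    "a": ("b", None),
    "b": ("c", None),
    "c": ("a", None),
    "d": ("a", None),
}

print(people_in_loops(faulty))   # ['a', 'b', 'c']
```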

The program was freely available on e-mail request to the author Torben B. Andersen but may no longer be available.

