DNA Study From The U.S. National Library of Medicine, National Institutes of Health: Nearly All Southern Europeans Have 1%-3% African Ancestry (Roman Era and Arab Migrations); All 8 Major Jewish Populations Have 3%-5% Sub-Saharan African Ancestry (Reflects African Ancestry Prior to Jewish Diaspora)

Source: The History of African Gene Flow into Southern Europeans, Levantines, and Jews (U.S. National Library of Medicine, National Institutes of Health), 2011 A.D.

From the source, “…Previous genetic studies have suggested a history of sub-Saharan African gene flow into some West Eurasian populations after the initial dispersal out of Africa that occurred at least 45,000 years ago. However, there has been no accurate characterization of the proportion of mixture, or of its date. We analyze genome-wide polymorphism data from about 40 West Eurasian groups to show that almost all Southern Europeans have inherited 1%–3% African ancestry with an average mixture date of around 55 generations ago, consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations. Levantine groups harbor 4%–15% African ancestry with an average mixture date of about 32 generations ago, consistent with close political, economic, and cultural links with Egypt in the late middle ages. We also detect 3%–5% sub-Saharan African ancestry in all eight of the diverse Jewish populations that we analyzed. For the Jewish admixture, we obtain an average estimated date of about 72 generations. This may reflect descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diasporas.

Southern Europeans and Middle Eastern populations are known to have inherited a small percentage of their genetic material from recent sub-Saharan African migrations, but there has been no estimate of the exact proportion of this gene flow, or of its date. Here, we apply genomic methods to show that the proportion of African ancestry in many Southern European groups is 1%–3%, in Middle Eastern groups is 4%–15%, and in Jewish groups is 3%–5%. To estimate the dates when the mixture occurred, we develop a novel method that estimates the size of chromosomal segments of distinct ancestry in individuals of mixed ancestry. We verify using computer simulations that the method produces useful estimates of population mixture dates up to 300 generations in the past. By applying the method to West Eurasians, we show that the dates in Southern Europeans are consistent with events during the Roman Empire and subsequent Arab migrations. The dates in the Jewish groups are older, consistent with events in classical or biblical times that may have occurred in the shared history of Jewish populations…

The history of human migrations from Africa into West Eurasia is only partially understood. Archaeological and genetic evidence indicate that anatomically modern humans arrived in Europe from an African source at least 45,000 years ago, following the initial dispersal out of Africa [1], [2]. However, it is known that Southern Europeans and Levantines (people from modern day Palestine, Israel, Syria and Jordan) have also inherited genetic material of African origin due to subsequent migrations. One line of evidence comes from Y-chromosome [3] and mitochondrial DNA analyses [4]–[6]. These have identified haplogroups that are characteristic of sub-Saharan Africans in Southern Europeans and Levantines but not in Northern Europeans [7]. Auton et al. [8] presented nuclear genome-based evidence for sharing of sub-Saharan African ancestry in some West Eurasians, by identifying a North-South gradient of haplotype sharing between Europeans and sub-Saharan Africans, with the highest proportion of haplotype sharing observed in south/southwestern Europe. However, none of these studies used genome-wide data to estimate the proportion of African ancestry in West Eurasians, or the date(s) of mixture. Throughout this report, we use “African mixture” to refer to gene flow into West Eurasians since the divergence of the latter from East Asians; thus, we are not referring to the much older dispersal out of Africa ∼45,000 years ago but instead to migrations that have occurred since that time…

We assembled data on 6,529 individuals drawn from 107 populations genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs) (Table S1). This included 3,845 individuals from 37 European populations in the Population Reference Sample (POPRES) [9], [10], 940 individuals from 51 populations in the Human Genome Diversity Cell Line Panel (HGDP-CEPH) [11], [12], 1,115 individuals from 11 populations in the third phase of the International Haplotype Map Project (HapMap3) [13], 392 individuals who self reported as having Ashkenazi Jewish ancestry from the InTraGen Population Genetics Database (IBD) [14] and 237 individuals from 7 populations in the Jewish HapMap Project [15]. For most analyses, we used HapMap3 Utah European Americans (CEU) to represent Northern Europeans and HapMap3 Yoruba Nigerians (YRI) to represent sub-Saharan Africans, although we also verified the robustness of our inferences using alternative populations.

We curated these data using Principal Components Analysis (PCA) [16] (Table S2), with the most important steps being: (i) Removal of 140 individuals as outliers who did not cluster with the bulk of samples of the same group, (ii) Removal of all 8 Greek samples as they separated into sub-clusters in PCA so that it was not clear which of these clusters was most representative, (iii) Splitting the Bedouins into two genetically discontinuous groups, and (iv) Reclassifying the 5 Italian groups into three ancestry clusters (Sardinian, Northern-Italy, and Southern-Italy) (see details in Text S1, Figure S1). A comparison of results before and after this curation is presented in Table S3, where we show that this data curation does not affect our qualitative inferences.

To study the signal of African gene flow into West Eurasian populations, we began by computing principal components (PCs) using San Bushmen (HGDP-CEPH- San) and East Eurasians (HapMap3 Han Chinese- CHB), and plotted the mean values of the samples from each West Eurasian population onto the first PC, a procedure called “PCA projection” [17], [18]. The choice of San and CHB, which are both diverged from the West Eurasian ancestral populations [19], [20], ensures that the patterns in PCA are not affected by genetic drift in West Eurasians that has occurred since their common divergence from East Eurasians and South Africans. We observe that many Levantine, Southern European and Jewish populations are shifted towards San compared to Northern Europeans, consistent with African mixture, and motivating formal testing for the presence of African ancestry (Figure 1, Figure S2)…

To formally test for the presence of African mixture, we first performed the 4 Population Test (Figure S3). This test is based on the insight that if populations A and B form sister groups relative to C and D, the allele frequency differences (pA-pB) and (pC-pD) should be uncorrelated as they represent independent periods of random genetic drift [21]. Applying the 4 Population Test to the proposed relationship (YRI,(Papuan,(CEU,X))) where X is a range of West Eurasian populations, we find significant violations for all Southern European, Jewish and Levantine populations but not for Northern Europeans (Table 1). The results remain unchanged even when we use alternate topologies replacing YRI with other African populations (Text S2, Table S4). We further verified these inferences with the 3 Population Test [21], which capitalizes on the insight that for any 3 populations (X; A, B), the product of the allele frequency differences (pX-pA) and (pX-pB) is expected to be negative only if population X descends from a mixture of populations related to populations A and B[21] (Figure S3). We verified that this method is robust to SNP ascertainment bias by carrying out simulations showing that the 3 Population Test detects real admixture even if all SNPs used in the analysis are discovered in population A, population B, or in both populations A and B (Text S3; Table S5; Figure S4). Application of the test to each West Eurasian population (using A = YRI and B = CEU) finds little or no evidence of mixture in North Europeans but highly significant evidence in many Southern European, Levantine and Jewish groups (Table 1)…
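[Editor's sketch, not part of the quoted paper.] The sign logic of the 3 Population Test described above can be illustrated on made-up allele frequencies. The function `f3`, the toy populations, the 10% mixture proportion and the drift size are all assumptions chosen for illustration; the published test also corrects for finite sample sizes and reports a Z-score, which this sketch omits.

```python
import random


def f3(p_x, p_a, p_b):
    """Average over SNPs of (pX - pA) * (pX - pB)."""
    products = [(x - a) * (x - b) for x, a, b in zip(p_x, p_a, p_b)]
    return sum(products) / len(products)


def clip(p):
    """Keep a simulated allele frequency inside [0, 1]."""
    return min(1.0, max(0.0, p))


rng = random.Random(0)  # fixed seed so the toy data are reproducible
n_snps = 5000

# Made-up allele frequencies for two source populations A and B.
p_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
p_b = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

# Toy population that is 10% A and 90% B (assumed mixture proportion).
alpha = 0.10
p_mixed = [alpha * a + (1 - alpha) * b for a, b in zip(p_a, p_b)]

# Toy population related only to B, with its own drift and no mixture.
p_unmixed = [clip(b + rng.gauss(0.0, 0.05)) for b in p_b]

f3_mixed = f3(p_mixed, p_a, p_b)
f3_unmixed = f3(p_unmixed, p_a, p_b)
print(f"f3 for the mixed population:   {f3_mixed:+.5f}")
print(f"f3 for the unmixed population: {f3_unmixed:+.5f}")
```

In this toy setup the statistic is negative for the mixed population and positive for the unmixed one, which is the contrast the test relies on.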

To estimate the proportion of sub-Saharan African ancestry in the various West Eurasian populations that showed significant evidence of mixture, we used f4 Ancestry Estimation [21], a method which produces accurate estimates of ancestry proportions, even in the absence of data from the true ancestral populations. This method estimates mixture proportions by fitting a model of mixture between two ancestral populations, followed by (possibly large) population-specific genetic drift. Briefly, we calculate a statistic that is proportional to the correlation in the allele frequency difference between West Eurasians and sub-Saharan Africans, and divide it by the same statistic for a population of sub-Saharan African ancestry, like YRI (Figure 2). This method has been shown through simulation to be robust to ascertainment bias on the SNP arrays and deviations from the assumed model of mixture (e.g. date and number of mixture events) [21]…
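[Editor's sketch, not part of the quoted paper.] The ratio idea described above can be shown on simulated frequencies: one statistic is computed for the target population and divided by the same statistic for an African reference. Which populations go in which slot is an assumption here (the paper's Figure 2 is not reproduced in this excerpt), and every name and number below is hypothetical.

```python
import random


def f4(p1, p2, p3, p4):
    """Average over SNPs of (p1 - p2) * (p3 - p4)."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4)]
    return sum(terms) / len(terms)


def clip(p):
    """Keep a simulated allele frequency inside [0, 1]."""
    return min(1.0, max(0.0, p))


rng = random.Random(1)  # fixed seed for reproducibility
n_snps = 20000

# Toy reference populations: one shares history with the European
# population, the other with the African population (assumed setup).
p_ref_eur = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
p_ref_afr = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
p_european = [clip(p + rng.gauss(0.0, 0.05)) for p in p_ref_eur]
p_african = [clip(p + rng.gauss(0.0, 0.05)) for p in p_ref_afr]

# Target population: 4% African, 96% European, plus its own drift.
true_alpha = 0.04
p_target = [
    clip(true_alpha * a + (1 - true_alpha) * e + rng.gauss(0.0, 0.02))
    for a, e in zip(p_african, p_european)
]

numerator = f4(p_ref_eur, p_ref_afr, p_target, p_european)
denominator = f4(p_ref_eur, p_ref_afr, p_african, p_european)
alpha_hat = numerator / denominator
print(f"true proportion {true_alpha:.3f}, estimated {alpha_hat:.3f}")
```

The drift added to the target population is independent of the reference populations, so it averages out across SNPs and the ratio stays close to the simulated proportion.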

Application of f4 Ancestry Estimation suggests that the highest proportion of African ancestry in Europe is in Iberia (Portugal 3.2±0.3% and Spain 2.4±0.3%), consistent with inferences based on mitochondrial DNA [6] and Y chromosomes [7] and the observation by Auton et al. [8] that within Europe, the Southwestern Europeans have the highest haplotype-sharing with Africans. The proportion decreases to the north and we find no evidence for mixture in Russia, Sweden and Scotland (Table 2, Figure S5). We also detect about 3-5% sub-Saharan African ancestry in all the Jewish populations, a finding that is novel as far as we are aware, and certainly has not been unambiguously demonstrated or quantified. For Levantines, the proportions are often higher: 9.3%±0.4% in Palestinians and >10% in the Bedouins (standard errors were calculated using a Block Jackknife as described in Materials and Methods). Table 2 presents the ancestry estimates that we obtain for all West Eurasian populations with significant evidence of mixture by the 4 Population Test (Z-score < -3). To test if our inferences are dependent on the sub-Saharan African population that was used as the reference group, we also repeated analyses with other sub-Saharan African populations replacing YRI. This analysis shows that our estimates of mixture proportions do not change significantly based on the ancestral population used (Text S2c, Table S6). We obtained similar estimates when we applied STRUCTURE 2.2 [22] to estimate the mixture proportions using ∼13,900 independent markers (that were not in linkage disequilibrium (LD) with each other) (Table 2, Figure S6)…
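[Editor's sketch, not part of the quoted paper.] The standard errors above come from a Block Jackknife. A minimal unweighted delete-one-block version is shown below, assuming equal-sized blocks and an estimator that is simply the mean of per-block values; the paper's own procedure is described in its Materials and Methods and may differ. The per-block numbers are invented.

```python
import math
import statistics


def jackknife_se(block_values, estimator):
    """Delete-one-block jackknife standard error (unweighted)."""
    n = len(block_values)
    # Recompute the estimate with each block left out in turn.
    leave_one_out = [
        estimator(block_values[:i] + block_values[i + 1:]) for i in range(n)
    ]
    center = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


# Hypothetical ancestry estimates from eight genome blocks (made up).
block_values = [0.031, 0.036, 0.029, 0.034, 0.030, 0.035, 0.033, 0.028]

estimate = statistics.mean(block_values)
se = jackknife_se(block_values, statistics.mean)
print(f"estimate {estimate:.4f} +/- {se:.4f}")
```

For a plain mean, this jackknife reproduces the usual standard error of the mean; its value lies in working the same way for estimators, such as ratios, that have no simple formula.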

The finding of sub-Saharan African ancestry in West Eurasians predicts that there will be a signature of admixture LD in the populations that experienced this mixture. That is, there will be LD between all markers that are highly differentiated between the two ancestral populations and the allele will be strongly correlated to the local ancestry [23]. Hence, there will be chromosomal segments of African ancestry with lengths that reflect the number of recombination events that have occurred since mixture, and thus can be used to estimate an admixture date. Figure 3 shows that this expected pattern is observed empirically in the decay of LD in four example West Eurasian populations, where we enhance the effects of admixture LD by weighting the SNP comparisons by frequency difference between the ancestral Africans (YRI) and ancestral West Eurasians (CEU). In the Southern European, Jewish and Levantine populations, this procedure produces clear evidence of admixture LD (Figure 3). However, Northern Europeans (Russians in Figure 3) do not show any evidence of African gene flow, consistent with the 4 Population and 3 Population Test results and Figure 1. Similar results are seen for other West Eurasian and Jewish populations that show evidence of mixture in the 4 Population Test.

To estimate a date for the mixture event, we developed a novel method ROLLOFF that computes the time since mixture using the rate of exponential decline of admixture LD in plots such as Figure 3. ROLLOFF computes the correlation between a (signed) statistic for LD between a pair of markers and a weight that reflects their allele frequency differentiation in the ancestral populations. By examining the correlation between pairs of markers as they become separated by increasing genetic distance and fitting an exponential distribution to this rolloff by least squares, we obtain an estimate of the date (see Materials and Methods and Text S4). ROLLOFF also computes an approximately normally distributed standard error by carrying out Weighted Jackknife analysis [24], where we drop one chromosome in each run and study the fluctuation of the statistic in order to assess the stability of the estimate.
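[Editor's sketch, not part of the quoted paper.] The dating step rests on admixture LD falling off roughly as exp(-n·d), where d is the genetic distance between two markers in Morgans and n is the number of generations since mixture. The example below recovers n from synthetic decay data by a least-squares line through the logarithm of the curve. That log-linear shortcut, the amplitude, the noise level and the distance grid are assumptions for illustration and are not ROLLOFF's actual fitting procedure.

```python
import math
import random


def fit_generations(distances_morgans, correlations):
    """Least-squares line through (d, ln y); the slope equals -generations."""
    xs = distances_morgans
    ys = [math.log(y) for y in correlations]  # requires positive values
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


rng = random.Random(2)  # fixed seed for reproducibility
true_generations = 55   # simulated value, matching the Southern European average
amplitude = 0.02        # hypothetical correlation at distance zero

# Marker-pair distances from 0.5 to 20 centimorgans, converted to Morgans.
distances_morgans = [0.5 * k / 100 for k in range(1, 41)]

# Exponential decay with a little multiplicative noise (2% log-scale).
correlations = [
    amplitude * math.exp(-true_generations * d) * math.exp(rng.gauss(0.0, 0.02))
    for d in distances_morgans
]

estimated_generations = fit_generations(distances_morgans, correlations)
print(f"estimated date: {estimated_generations:.1f} generations")
```

With real data the correlations at large distances are noisy and can be negative, which is one reason a direct exponential fit is preferable to taking logarithms.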

To verify the accuracy and sensitivity of ROLLOFF, we carried out extensive simulations by constructing the genomes of individuals of mixed ancestry by sampling haplotypes from North Europeans (CEU) and West Africans (YRI) (see Materials and Methods). We verified that ROLLOFF produces accurate estimates of the date of mixture, even in the case of old admixture (up to 300 generations – Figure 4) and is robust to substantially inaccurate ancestral populations as well as fine scale errors in the genetic map (Text S4; Figure S7; Figure S8; Table S7; Table S8). In addition, to test the robustness of our inferences, we applied all the methods to African Americans and obtained consistent results for the proportion of mixture (79.4±0.3%) and date of mixture (6±1), which is in agreement with previous reports [25], [26]. However, in the case of low mixture proportion and old admixture dates, we observed that there is a slight bias in the estimated date (Text S4d, Table S9). This effect is related to the weakness of the signal: it attenuates as the sample size or admixture proportion becomes larger (Text S4d, Table S10, Table S11)…

An important concern was how ROLLOFF would perform when the true history of admixture involved multiple pulses of gene exchange, rather than the single pulse of gene exchange that we modeled. To explore this, we first simulated two distinct gene flow events, and then estimated the date using a single exponential distribution. The simulations show that ROLLOFF's estimate of the date tends to correspond reasonably well to the more recent admixture event, with a slight upward bias towards the older date. Second, we performed simulations under a continuous gene flow model and found that the estimated dates are intermediate between the start and end of the gene flow, as expected (Figure S9; Figure S10; Table S12). To explore if we could obtain a better inference of the range of dates, we tried fitting a sum of multiple exponential distributions, but this did not work reliably, which may be related to the well-known difficulty of fitting a sum of exponentials to data with even a small amount of noise [27] (Text S4). Pool and Nielsen recently showed that multi-marker haplotype data could be useful for distinguishing a single pulse of gene exchange from changing migration rates over time [28]. However, a complication with applying this approach to relatively old dates is that haplotype-based methods need to model background LD. In the case of old mixture events (dozens or hundreds of generations), inaccurate modeling of background LD can bias estimates [26], [29]. We are not aware of any published method that can produce accurate date estimates while modeling background LD correctly for mixture dates as old as those that have been explored by ROLLOFF in Figure 4.

We applied ROLLOFF to all the West Eurasian populations that gave significant signals of mixture by the 4 Population Test, fitting a single exponential decay in each case. We estimate that the date of sub-Saharan African mixture in Portugal is 45±5 generations and in Spain is 55±3 generations. We estimate a more recent date of 34±3 for Bedouin-g1, 33±2 for Bedouin-g2, and 34±2 generations for Palestinians. We estimate older dates of ∼70–150 generations in the various Jewish populations, with wide and in most cases overlapping confidence intervals (Table 2; Figure S11). Averaging the mixture dates over all populations from each region (weighted by the inverse of the squared standard error), we obtain an average of 55 generations for Southern Europeans, 34 for Levantines and 89 for Jews.
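[Editor's sketch, not part of the quoted paper.] The regional averages above are weighted by the inverse of the squared standard error. The snippet below applies that weighting to the three Levantine dates quoted in this paragraph; the helper name is hypothetical, and the paper's regional average may include further populations from its Table 2.

```python
import math


def inverse_variance_mean(estimates):
    """Weighted mean of (value, standard_error) pairs, weights = 1 / se^2."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, estimates)) / total_weight
    # Standard error of the weighted mean, assuming independent estimates.
    return mean, math.sqrt(1.0 / total_weight)


# Dates in generations quoted above: Bedouin-g1, Bedouin-g2, Palestinians.
levantine_dates = [(34, 3), (33, 2), (34, 2)]

mean_date, mean_se = inverse_variance_mean(levantine_dates)
print(f"weighted average: {mean_date:.1f} generations (rounds to {round(mean_date)})")
```

The result, about 33.6 generations, rounds to the 34 reported for Levantines; estimates with smaller standard errors pull the average harder than imprecise ones.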

As described above, in our simulations to explore the behavior of ROLLOFF we detect an upward bias in the date estimates that grew worse with older mixture dates, small mixture proportions, and small sample sizes (but does not appear to be affected by use of inaccurate ancestral populations). To assess the degree to which this bias might be affecting our date estimates, we performed simulations for each population in Table 2 separately, in which we set the number of samples, mixture proportion and time since mixture to match the parameters estimated from the real data. We repeated our simulations 100 times for each parameter setting and estimated the bias of our estimated date from the true (simulated) date. The bias is very small for most of the Southern European and Levantine samples, which generally had large sample sizes, recent dates, and high mixture proportions. However, the bias is larger for the Jewish groups (Table 2, Table S13). Correcting for the bias inferred in our simulation of Table S12, we obtain corrected estimates of the average date of 55 generations for Southern Europeans, 32 for Levantines, and 72 for Jews. A caveat about these regional date estimates is that they reflect weighted averages across the populations in each region. However, the admixture events detected within each region may not reflect the same historical events; for example, it is plausible that the sub-Saharan African admixture in Spain and Italy has different historical origins…

The finding of African ancestry in Southern Europe dating to ∼55 generations ago, or ∼1,600 years ago assuming 29 years per generation [30], needs to be placed in historical context. The historical record documents multiple interactions of African and European populations over this period. One potential opportunity for African gene flow was during the period of Roman occupation of North Africa that lasted until the early 5th century AD, and indeed tomb inscriptions and literary references suggest that trade relations continued even after that time [31], [32]. North Africa was also a supplier of goods and products such as wine and olive oil to Italy, Spain and Gaul from 200–600 AD, and Morocco was a major manufacturer of the processed fish sauce condiment, garum, which was imported by Romans [33]. In addition, there was slave trading across the western Sahara during Roman times [7], [34]. Another potential source of some of the African ancestry, especially in Spain and Portugal, is the invasion of Iberia by Moorish armies after 711 AD [35], [36]. If the Moors already had some African ancestry when they arrived in Southern Europe, and then admixed with Iberians, we would expect the admixture date to be older than the date of the invasion, as we observe.
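[Editor's sketch, not part of the quoted paper.] The conversion from generations to years used in the quoted text is a single multiplication by the assumed 29 years per generation. The regional dates below are the bias-corrected averages quoted from the paper; the function and variable names are hypothetical.

```python
YEARS_PER_GENERATION = 29  # generation time assumed in the quoted paper


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a mixture date in generations to years before sampling."""
    return generations * years_per_generation


# Bias-corrected regional averages quoted from the paper, in generations.
regional_dates = {
    "Southern Europeans": 55,
    "Levantines": 32,
    "Jewish groups": 72,
}

years_ago = {
    region: generations_to_years(generations)
    for region, generations in regional_dates.items()
}

for region, years in years_ago.items():
    print(f"{region}: about {years} years ago")
```

These products (1,595, 928 and 2,088 years) are what the text rounds to roughly 1,600, 1,000 and 2,000 years; a different generation time would shift every date proportionally.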

The signal of African mixture that we detect in Levantines (Bedouins, Palestinians and Druze) – an average of 32 generations or ∼1000 years ago – is more recent than the signal in Europeans, which might be related to the migrations between North Africa and Middle East that have occurred over the last thousand years, and the proximity of Levantine groups geographically to Africa. Syria and Palestine were under Egyptian political control until the 16th century AD when they were conquered by the Ottoman Empire. This is in concordance with our proposed dates. In addition, the Arab slave trade is responsible for the movement of large numbers of people from Africa across the Red Sea to Arabia from 650 to 1900 AD and probably even prior to the Islamic times [7], [37]. We caution that our sampling of the Middle East is sparse, and it will be of interest to study African ancestry in additional groups from this region.

A striking finding from our study is the consistent detection of 3–5% sub-Saharan African ancestry in the 8 diverse Jewish groups we studied, Ashkenazis (from northern Europe), Sephardis (from Italy, Turkey and Greece), and Mizrahis (from Syria, Iran and Iraq). This pattern has not been detected in previous analyses of mitochondrial DNA and Y chromosome data [7], and although it can be seen when re-examining published results of STRUCTURE-like analyses of autosomal data, it was not highlighted in those studies, or shown to unambiguously reflect sub-Saharan African admixture [15], [38]. We estimate that the average date of the mixture of 72 generations (∼2,000 years assuming 29 years per generation [30]) is older than that in Southern Europeans or other Levantines. The point estimates over all 8 populations are between 1,600–3,400 years ago, but with largely overlapping confidence intervals. It is intriguing that the Mizrahi Irani and Iraqi Jews—who are thought to descend at least in part from Jews who were exiled to Babylon about 2,600 years ago [39], [40]—share the signal of African admixture. (An important caveat is that there is significant heterogeneity in the dates of African mixture in various Jewish populations.) A parsimonious explanation for these observations is that they reflect a history in which many of the Jewish groups descend from a common ancestral population which was itself admixed with Africans, prior to the beginning of the Jewish diaspora that occurred in 8th to 6th century BC [41]. The dates that emerge from our ROLLOFF analysis in the non-Mizrahi Jews could also reflect events in the Greek and Roman periods, when there were large communities of Jews in North Africa, particularly Alexandria [34], [42]. We detect a similar African mixture proportion in the non-Jewish Druze (4.4±0.4%) although the date is more recent (54±7 generations; 44±7 after the bias correction). 
Algorithms such as PCA and STRUCTURE show that various Jewish populations cluster with Druze [15], which coupled with the similarity in mixture proportions, is consistent with descent from a common ancestral population. Importantly, the other Levantine populations (Bedouins and Palestinians) do not share this similarity in the African mixture pattern with Jews and Druze, making them distinct in their admixture history.

A caveat to these results is that we estimated dates assuming instantaneous mixture, but in fact we have not distinguished between the patterns expected for instantaneous admixture and continuous gene flow over a long period. In Text S4f, we report simulations showing that for continuous gene flow, the dates from ROLLOFF reflect the average of mixture dates over a range of times, and so the date should be interpreted only as an average number.

A potential issue that could in theory influence our findings is that the exact population contributing to African ancestry in West Eurasians is unknown. To gain insight into the African source populations, we carried out PCA analyses, which suggested that the African ancestry in West Eurasians is at least as closely related to East Africans (e.g. HapMap3 Luhya (LWK)) as to West Africans (e.g. Nigerian Yoruba (YRI)) (the same analyses show that there is no evidence of relatedness to Chadic populations like Bulala) (Text S5 and Figure S12). We also used the 4 Population Test to assess whether the tree ((LWK, YRI),(West Eurasian, CEU)) is consistent with the data, and found no evidence for a violation, which is consistent with a mixture of either West African or East African ancestors or both contributing to the African ancestry in West Eurasians (Table S14; Figure S13). Historically, a mixture of West and East African ancestry is plausible, since African gene flow into West Eurasia is documented from both West Africa during Roman times [34] and from East Africa during migrations from Egypt [7]. It is important to point out, however, that the difficulty of pinpointing the exact African source population is not expected to bias our inferences about the total proportion and date of mixture. The f4 Ancestry Estimation method is unbiased even when we use a poor surrogate for the true ancestral African population (as long as the phylogeny is correct), as we confirmed by repeating analyses replacing YRI with LWK, and obtaining similar results (Table S15). Our ROLLOFF admixture date estimates are also similar whether we use LWK or YRI to represent the ancestral African population (Table S15), as predicted by the theory.

In summary, we have documented a contribution of sub-Saharan African genetic material to many West Eurasian populations in the last few thousand years. A priority for future work should be to identify the source populations for this admixture…

We analyzed individuals of West Eurasian ancestry from several sources: The Population Reference Sample (POPRES) [9]–[10] (n = 3,845 samples from 37 populations genotyped on an Affymetrix 500K array), the Human Genome Diversity Cell Line Panel (HGDP-CEPH) [12] (n = 940 samples from 51 populations genotyped on an Illumina 650K array), The International Haplotype Map (HapMap) Phase 3 [13] (n = 1,115 samples from 11 populations genotyped on an Illumina 1M array), the InTraGen Population Genetics Database (IBD) [14] (n = 392 Ashkenazi Jews genotyped on an Illumina 300K array) and the Jewish HapMap Project [15] (n = 237 from 7 Jewish populations genotyped on an Affymetrix 6.0 array). We created a merged dataset containing 6,529 individuals, out of which 3,614 individuals of West Eurasian, African and Eastern Eurasian ancestry were used for the final analysis. Detailed information about the number of individuals and markers included in each analysis is provided in Table S1. We used NCBI Build 35 to determine physical position and the Oxford LD-based genetic map to determine genetic positions of all SNPs [43]…”

This source reports that almost all Southern Europeans have some African ancestry (1%–3%). It also reports that all 8 Jewish populations analyzed carry 3%–5% sub-Saharan African ancestry. This could be used to show that the ancient common ancestral population from which each of these groups diverged included black Israelites and/or black people.

By One For All
