In this work, clustering of populations according to their geographic origin was visualized more clearly using haplotype frequencies than using the Fst distance matrix based on individual SNP frequencies, which shows that haplotypes are more informative for distinguishing between populations of different ethnicity or geographic origin. The distinct geographical clustering of populations from different continental regions is striking considering that these results are based on variation at only 4 SNPs in a single gene. Such results are usually not seen until one includes data from many, often dozens, of independent polymorphisms. Since other studies have also shown that COMT gene frequencies differ according to ethnicity in populations sampled worldwide (Palmatier et al., 1999; DeMille et al., 2002; Mukherjee et al., 2010; Gonzalez-Castro et al., 2013), COMT gene polymorphism could be useful for forensic studies.
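For readers unfamiliar with the SNP-based distance mentioned above, the following is a minimal sketch of a pairwise Fst computed from the frequency of one allele at a single biallelic SNP, using the classical heterozygosity form Fst = (Ht - Hs) / Ht. The estimator actually used in this study is not stated in this section, so this choice is an assumption, as are the function name `fst_two_pops` and the example frequencies, which are illustrative and not data from this work.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Pairwise Fst at one biallelic SNP from the frequency of the same
    allele in two populations (equal weighting of the two populations).

    Assumption: heterozygosity-based Fst = (Ht - Hs) / Ht, without any
    sample-size correction. Not necessarily the estimator used in the paper.
    """
    p_bar = (p1 + p2) / 2.0                      # mean allele frequency
    h_t = 2.0 * p_bar * (1.0 - p_bar)            # expected heterozygosity, pooled
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    if h_t == 0.0:
        # The SNP is monomorphic in both populations: no differentiation to measure.
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    print(fst_two_pops(0.30, 0.30))  # identical frequencies
    print(fst_two_pops(0.10, 0.60))  # divergent frequencies
```

A distance matrix is obtained by applying such a function to every pair of populations, typically after averaging over SNPs. Because each SNP is treated separately, the measure ignores the phase information that haplotype frequencies retain.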
Within the cluster of North African populations, the Sousse sample is somewhat different from the other six populations. The reasons for this isolation, which has also been shown by other genetic studies (Fadhlaoui-Zid et al., 2015), need to be analyzed in more depth. It seems that the particular genetic structure of Sousse is related to the high level of admixture of this population rather than to a lack of gene flow. Indeed, this city was founded by Phoenicians in 1100 BC. Interestingly, it is one of the rare Punic towns that escaped destruction by the Romans during the Punic wars. The presence of the Phoenician Y chromosome at a frequency of about 10% in Sousse (Fadhlaoui-Zid et al., 2012; Fadhlaoui-Zid et al., 2015) confirms the continuity of this population since its foundation. Moreover, its location on the sea and its economic role, leading to commercial and human exchanges, would explain the diversity observed at the haplotypic level.
Indeed, while analysis of haplotype distribution gives information about population relationships, the study of linkage disequilibrium (LD) may be informative about the time of settlement of a population and its level of admixture. LD profiles in the 22q11.2 region that contains the COMT gene vary with both the region of the gene and the geographical origin of the population (Mukherjee et al., 2010). High LD was demonstrated in the 5′ and 3′ regions but not in the coding region. The 3 studied SNPs (rs2020917, rs4818, rs4680) belong to the high-LD regions in European, Asian (South West Asia and East Asia) and Native American populations (Mukherjee et al., 2010). In this study we found the same results, although we used a different strategy based on a small number of SNPs well distributed over the entire gene. In this region we found high levels of LD in North African populations too, with the two pairs (rs4818-rs4680) and (rs4680-rs9332377) displaying the highest LD levels. The two SNPs (rs4818-rs4680) analyzed in this study are located in the coding region of the COMT gene, which is characterized by low haplotype diversity and a low level of LD except in Eurasian populations (Mukherjee et al., 2010) and in the populations of North Africa, as shown in this study. This is probably due to a less ancient origin and/or to a high level of admixture. However, one cannot exclude drift and selection effects that might accompany population settlement history, including adaptation to geographical latitude.
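As an illustration of what a pairwise LD level means for a SNP pair such as those discussed above, the sketch below computes the standard coefficients D, D′ and r² from the four two-locus haplotype frequencies. The function name `pairwise_ld` and the frequencies in the example are hypothetical and are not taken from this study; which LD statistic and software the authors used is not stated in this section.

```python
def pairwise_ld(f_AB: float, f_Ab: float, f_aB: float, f_ab: float):
    """Return (D, D', r^2) for two biallelic loci.

    Inputs are the frequencies of the four haplotypes, where A/a are the
    alleles at the first SNP and B/b the alleles at the second SNP.
    The four frequencies are expected to sum to 1.
    """
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = f_AB + f_Ab          # frequency of allele A at the first SNP
    p_B = f_AB + f_aB          # frequency of allele B at the second SNP

    # D: deviation of the observed AB frequency from independence.
    d = f_AB - p_A * p_B

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1.0 - p_B), (1.0 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1.0 - p_A) * (1.0 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    r2 = (d * d) / denom if denom > 0 else 0.0

    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical haplotype frequencies, for illustration only.
    d, d_prime, r2 = pairwise_ld(0.45, 0.05, 0.10, 0.40)
    print(f"D = {d:.4f}, D' = {d_prime:.4f}, r^2 = {r2:.4f}")
```

Both D′ and r² range from 0 (independent loci) to 1. Recombination over many generations erodes LD, whereas recent admixture between populations with different allele frequencies can create it, which is why LD levels can bear on settlement time and admixture.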