GENOME-WIDE SNP ANALYSIS OF INDIGENOUS ZEBU BREEDS IN PAKISTAN

Advances in high-throughput technology have made it easier to investigate hidden genetic variation in farm animal genetic resources, and several SNP technologies are currently in use in animal genetics. In this study, we investigated genome-wide SNP variation and its distribution across the indigenous cattle population of Pakistan using the Illumina Bovine HD (777K) SNP BeadChip. A total of 136 individuals from ten breeds were genotyped and, after filtering, 500,939 SNP markers were used for further analysis. The mean minor allele frequency (MAF) was 0.23, 0.20, 0.22, 0.22, 0.20, 0.18, 0.20, 0.22, 0.21 and 0.18 for Achi, Bhagnari, Cholistani, Dhanni, Dajal, Kankraj, Lohani, Red Sindhi, Sahiwal and Tharparkar cattle, respectively. MAF differed significantly (P<0.001) among the selected populations. Common variants (MAF ≥ 0.10 and ≤ 0.5) were estimated at 64%. Across all sampled populations, 64% of SNP markers were polymorphic (MAF>0.05) within breeds and the remaining 36% were considered monomorphic. The average observed (Ho) and expected (HE) heterozygosity across these breeds were 0.662 and 0.640, respectively. In conclusion, the results of this preliminary study indicate that this level of SNP variation could be used for genetic characterization of zebu cattle breeds and to estimate their genetic potential for livestock improvement in the country.


Introduction
The Bovine high-density (HD) SNP assay is one of the most comprehensive genotyping tools for exploring genome variation at high resolution across cattle breeds (Howard et al., 2015). It features more than 777,000 SNP probes that are evenly distributed across the entire bovine genome (Leroy, 2014). This array was first introduced in 2009 (Mbole-Kariuki et al., 2014). Its applications include genome-wide association studies, quantitative trait loci (QTL) identification, prediction of genetic merit, linkage disequilibrium analysis and breed characterization (Pryce et al., 2014). The potential of this array has been demonstrated in several studies that identified genomic regions contributing strongly to phenotypic variation. These regions are related to feed efficiency and intake traits (Lin et al., 2010; Edea et al., 2014), milk production traits and meat traits (Howard et al., 2015; Kim et al., 2015).
In addition, prediction of genomic breeding values from genomic data has been used extensively for cattle selection in breeding programmes (Edea et al., 2014). The reliability of genomic selection tools depends mainly on the existence of linkage disequilibrium (LD) between SNPs and the QTL that affect the traits of interest (Caruthers et al., 2011; Curik et al., 2014; Kim et al., 2015). In the U.S.A. and other developed countries, genomic information is widely used for genetic evaluation of dairy and beef animals (Howard et al., 2015). The Bovine HD SNP assay has also been used to identify copy number variations (CNV), which are used to associate QTL with phenotypes (Bickhart et al., 2016). Furthermore, the assay has been used to assess genetic relationships among and within cattle breeds and to detect signatures of selection in different dairy and beef breeds (Kim et al., 2015).
In Pakistan, all genetic improvement programmes for dairy and beef cattle breeds are based on conventional quantitative genetics methods. Phenotypic and pedigree data for estimating breeding values in these breeds are also limited (Mustafa et al., 2014). Conventionally, the genetic structure of economically important traits was treated as a black box, with little information on the gene variants affecting phenotypic expression of these traits, on gene interactions, or on the location of these genes in the genome (Decker et al., 2014; Hussain et al., 2016). Meanwhile, genomic selection has been found to have a high probability of increasing genetic gain in cattle, and it permits more accurate genetic predictions for traits of low heritability in farm animals than conventional phenotypic selection (Groeneveld et al., 2010; Curik et al., 2014; Leroy, 2014; Kim et al., 2015). Currently, indigenous cattle breeds in Pakistan still lack the opportunity for high-throughput evaluation, which is needed to better understand complex evolutionary processes and to support breeding improvement programmes. To date, no indigenous Pakistani cattle breed has been included in either a training or a validation population using the Bovine HD SNP BeadChip (Mustafa et al., 2014).
Therefore, it is necessary to assess the usefulness of the Bovine HD SNP BeadChip in indigenous Pakistani cattle breeds. Evaluation of this high-throughput technique would help to improve cattle farming and to establish a reference population. The aim of this analysis was to determine how informative the Bovine HD SNP BeadChip is by measuring locus polymorphism in the indigenous cattle population of Pakistan.

Animal sampling, genomic DNA extraction and genotyping
Jugular blood samples (10 ml each) were collected in EDTA-containing tubes from ten breeds in the agro-geographical areas where these breeds are found (Table 1; Figures 1 and 2). Genomic DNA (gDNA) extraction and data quality control have been described in a previous study (Mustafa et al., 2014). The selected samples were genotyped at a USDA platform using the Illumina Bovine high-density (HD) SNP BeadChip (version 2), which spans 777,962 SNP markers across the bovine genome. A quantity of 200 ng of gDNA per sample was used for genotyping according to the manufacturer's protocol.

Data analysis
Genotypic data were generated on the iScan system. Raw data analysis, including genotype calling, clustering and data normalization, was performed using GenomeStudio software version 1.9.0 (Edea et al., 2015). PED and MAP files for downstream analyses were created from GenomeStudio using PLINK (version 1.9). The quality assurance module of SVS (version 8; Golden Helix Inc., USA) was used to compute genotype statistics: each marker was analysed for call rate, Hardy-Weinberg equilibrium (HWE), minor allele frequency (MAF) and genotype counts. The quality control (QC) thresholds applied before further analysis were a call rate of 95% and a MAF of 0.05. Hardy-Weinberg equilibrium (P<0.001) was tested to help identify genotyping errors (Kim et al., 2015).
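The per-marker statistics named above (call rate, MAF and an HWE test) can be illustrated with a minimal Python sketch. It is not the SVS or PLINK implementation: the genotype coding (0, 1 or 2 copies of one allele, `None` for a failed call), the function name `snp_qc` and the use of a chi-square goodness-of-fit test for HWE are assumptions made for illustration. Because the text does not state whether markers failing HWE were removed, the sketch only flags them.

```python
import math

def snp_qc(calls, min_call_rate=0.95, min_maf=0.05, hwe_alpha=0.001):
    """QC summary for one SNP (hypothetical helper, not the SVS module).

    calls: one genotype code per animal: 0, 1 or 2 copies of the B allele,
    or None for a failed call (this coding is an assumption of the sketch).
    """
    called = [c for c in calls if c is not None]
    n = len(called)
    call_rate = n / len(calls)
    if n == 0:
        return {"call_rate": 0.0, "maf": None, "hwe_p": None,
                "hwe_flag": False, "keep": False}
    p = sum(called) / (2 * n)            # frequency of the B allele
    maf = min(p, 1 - p)
    if maf == 0:
        hwe_p = 1.0                      # monomorphic marker: test not informative
    else:
        # Chi-square goodness-of-fit (1 df) against Hardy-Weinberg proportions.
        observed = [called.count(k) for k in (0, 1, 2)]
        expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        hwe_p = math.erfc(math.sqrt(chi2 / 2))   # chi-square survival function, 1 df
    return {
        "call_rate": call_rate,
        "maf": maf,
        "hwe_p": hwe_p,
        "hwe_flag": hwe_p < hwe_alpha,   # flagged as a possible genotyping error
        "keep": call_rate >= min_call_rate and maf >= min_maf,
    }

# Toy marker with one failed call out of four animals: fails the call-rate threshold.
print(snp_qc([1, 1, None, 0]))
```

In practice an exact HWE test is often preferred for small samples; the chi-square version is used here only because it is compact.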

Minor Allele Frequency (MAF) Distribution
The minor allele frequency (MAF) was calculated for each SNP in the generated data set and is presented in Table 2 and Figure 3. Analysis of the 500,939 SNP markers gave an average MAF of 0.23, 0.20, 0.22, 0.22, 0.20, 0.18, 0.20, 0.22, 0.21 and 0.18 for Achi, Bhagnari, Cholistani, Dhanni, Dajal, Kankraj, Lohani, Red Sindhi, Sahiwal and Tharparkar cattle, respectively. The difference among these breeds was significant (P<0.001). The overall MAF observed in this study was higher than that previously reported in indicine breeds (McKay et al., 2008; Edea et al., 2015; Kim et al., 2015) and lower than the average value of 0.28 reported for Red Chittagong cattle (Uzzaman et al., 2014). As expected, the average MAF was lower than that of most Bos taurus cattle breeds (McKay et al., 2008; Mustafa et al., 2014). The differences between the MAF found in this study and earlier reports may be attributed to the different marker densities (Illumina Bovine 8K, 10K, 50K, 80K and 700K) used in previous studies of cattle breeds around the world, and to the fact that samples from most of these breeds were not used before or during the design of these chips (Chen et al., 2010; Lin et al., 2010; Melka et al., 2011; Edea et al., 2014; Uzzaman et al., 2014).
The SNP variation across all Pakistani cattle breeds was also examined. Common variants (MAF ≥ 0.10 and ≤ 0.5) accounted for 64% of SNPs (Table 3). Among the selected breeds, Dhanni cattle displayed the highest proportion of common variants (69%). Rare variants (MAF > 0 and < 0.05) made up 11% of SNPs across all breed samples. The higher proportion of fixed alleles in some of the cattle populations suggests inbreeding resulting from uncontrolled breeding management in the country (Groeneveld et al., 2010; Lin et al., 2010; Leroy, 2014). A high proportion of common variants (83%) has also been reported in sheep (Kijas et al., 2009). On average, 32% of SNPs had a MAF of ≥ 0.30 and ≤ 0.5, which is higher than the polymorphism previously reported in cattle breeds (McKay et al., 2008; Kim et al., 2015). It is well established that Illumina bovine BeadChips show higher MAFs in Bos taurus than in Bos indicus, because only a limited number of indicine breeds were used during chip development (Decker et al., 2014; Mustafa et al., 2014; Bickhart et al., 2016). The proportion of fixed SNPs (MAF = 0) was also examined and averaged 8% across these breeds. The highest proportion of fixed SNPs was observed in Dhanni and Tharparkar (10%) and the lowest in Bhagnari and Lohani (6%). Across all sampled populations, 64% of SNP markers were polymorphic (MAF>0.05) within breeds and the remaining 36% were considered monomorphic (Figure 5). The highest proportion of polymorphic markers was found in the Dhanni breed (69%). The level of SNP variation in this study was higher than that previously reported in different cattle breeds (Curik et al., 2014; Howard et al., 2015; Kim et al., 2015). However, the results were closely similar to some previously reported variation in other farm animals, including sheep and goat, assessed with genome-wide SNP arrays (Kijas et al., 2009). The observed polymorphism in these breeds could be explained by the fact that most of the bovine sequence data available during development of the BeadChip came from European cattle breeds (Bos taurus) (Gautier et al., 2010; Melka et al., 2011; Edea et al., 2014; Decker et al., 2014).
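The MAF classes discussed in this section can be tabulated with a short Python sketch. The fixed (MAF = 0), rare (> 0 and < 0.05) and common (≥ 0.10 and ≤ 0.5) boundaries are taken from the text; the "intermediate" class (≥ 0.05 and < 0.10) is an assumption added so that the classes cover the whole range, and the function names are hypothetical.

```python
from collections import Counter

CLASSES = ("fixed", "rare", "intermediate", "common")

def maf_class(maf):
    """Assign one per-SNP MAF value (0 to 0.5) to a frequency class."""
    if maf == 0:
        return "fixed"          # MAF = 0
    if maf < 0.05:
        return "rare"           # > 0 and < 0.05
    if maf < 0.10:
        return "intermediate"   # >= 0.05 and < 0.10 (assumed class, not named in the text)
    return "common"             # >= 0.10 and <= 0.5

def maf_spectrum(mafs):
    """Percentage of SNPs falling in each MAF class for one breed."""
    counts = Counter(maf_class(m) for m in mafs)
    return {name: 100 * counts[name] / len(mafs) for name in CLASSES}

# Toy values only; these are not data from the study.
print(maf_spectrum([0.0, 0.02, 0.07, 0.15, 0.30, 0.45, 0.5, 0.0]))
```

Running the function once per breed on that breed's per-SNP MAF values would give a breakdown of the kind summarised in Table 3.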

Genetic Diversity among Pakistani cattle breeds
The genomic variability within these cattle breeds was also examined, and heterozygosity levels were compared between breeds (Table 4). Across all breeds, the average observed (Ho) and expected (HE) heterozygosity were 0.662 and 0.640, respectively. This average heterozygosity was higher than that previously reported from microsatellite marker analysis of some indicine cattle breeds (Hussain et al., 2016). It was, however, in close agreement with values previously reported using SNPs in Brahman, Gir and Nellore cattle (Dadi et al., 2012; Leroy, 2014; Pryce et al., 2014; Decker et al., 2014; Bickhart et al., 2016).
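As an illustration of how observed and expected heterozygosity are commonly obtained from biallelic SNP genotypes, the following Python sketch averages per-marker values over loci. The genotype coding (0, 1 or 2 copies of one allele, `None` for a missing call), the function name and the use of the uncorrected estimate 2p(1 - p) for expected heterozygosity are assumptions; the software used in the study may apply a different estimator.

```python
def mean_heterozygosity(genotypes):
    """Average observed and expected heterozygosity over SNPs.

    genotypes: list of per-SNP lists, each holding one code per animal
    (0, 1 or 2 copies of one allele, None for a missing call).
    """
    ho_values, he_values = [], []
    for calls in genotypes:
        called = [c for c in calls if c is not None]
        if not called:
            continue                         # skip markers with no calls at all
        n = len(called)
        p = sum(called) / (2 * n)            # allele frequency at this SNP
        ho_values.append(called.count(1) / n)   # share of heterozygous animals
        he_values.append(2 * p * (1 - p))       # Hardy-Weinberg expectation
    if not ho_values:
        raise ValueError("no SNP with at least one called genotype")
    return sum(ho_values) / len(ho_values), sum(he_values) / len(he_values)

# Toy genotypes only; these are not data from the study.
ho, he = mean_heterozygosity([[0, 1, 2, 1], [0, 0, 1, 1], [2, 2, 2, None]])
print(round(ho, 4), round(he, 4))
```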
F-statistics were also estimated for these breeds. The overall within-population inbreeding coefficient (FIS) was estimated at 0.073, the total inbreeding coefficient (FIT) at 0.082 and genetic differentiation (FST) at 0.076 (Mbole-Kariuki et al., 2014; Edea et al., 2015; Kim et al., 2015). A previous genetic characterization of the Pakistani cattle population using microsatellite markers reported an FIS of 0.2819, an FIT of 0.3864 and an FST of 0.1456 (Hussain et al., 2016). The FST of cattle breeds in Pakistan was low, as reported in some previous zebu cattle studies (Gautier et al., 2010; Groeneveld et al., 2010; Lin et al., 2010; Leroy et al., 2014), which may be due to their common origin.
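Wright's three F-statistics are linked by the identity (1 - FIT) = (1 - FIS)(1 - FST), which gives a quick consistency check on reported values. The Python sketch below applies it to the figures quoted above; the function name is hypothetical, and the identity holds exactly only when all three statistics come from the same estimator and the same set of loci.

```python
def total_inbreeding(f_is, f_st):
    """FIT implied by FIS and FST under Wright's identity."""
    return 1 - (1 - f_is) * (1 - f_st)

# Microsatellite values quoted from Hussain et al. (2016): reported FIT is 0.3864.
print(round(total_inbreeding(0.2819, 0.1456), 4))

# SNP-based values from this study: reported FIT is 0.082.
print(round(total_inbreeding(0.073, 0.076), 4))
```

The microsatellite values agree with the identity to within rounding. The SNP-based FIS and FST would imply an FIT of about 0.143 rather than the reported 0.082, which may reflect a different estimator or averaging across loci; the sketch cannot tell which.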

Conclusion
The results of this preliminary study with the Bovine high-density SNP assay showed that the distribution of SNP markers across the genome differed significantly among the native zebu breeds of Pakistan, and they established the level of polymorphism and minor allele frequency (MAF) in these breeds. The levels of SNP variation found here encourage wider future use of the Bovine HD SNP assay for genetic studies in these breeds. These results could be used to understand breed composition and within-breed diversity, which offers an opportunity to improve this important genetic resource through effective selection strategies.

Table 1. Animal sampling and geographical details
*Figure 1 shows the complete agro-geographic locations