COMPARISON OF SINGLE NUCLEOTIDE POLYMORPHISMS AND MICROSATELLITES IN NON-INVASIVE GENETIC MONITORING OF A WOLF POPULATION

Single nucleotide polymorphisms (SNPs), which represent the most widespread source of sequence variation in genomes, are becoming routine markers in several fields such as forensics, ecology and conservation genetics. Because they require only short amplifications, they may allow more efficient genotyping of degraded DNA. We provide the first application of SNP genotyping in an Italian non-invasive genetic monitoring project of the wolf. We compared three different techniques for genotyping SNPs: pyrosequencing, SNaPshot® and the TaqMan® Probe Assay in real-time PCR. We subsequently genotyped nine SNPs using the TaqMan Probe Assay in 51 Italian wolves, 57 domestic dogs, 15 wolf x dog hybrids and 313 wolf scats collected in the northern Apennines. The results were used to estimate genetic variability and PCR error rates in SNP genotyping protocols compared to standard microsatellite analysis. We evaluated the cost, laboratory effort and reliability of these different markers and discuss the possible future use of the VeraCode, SNPlex and Fluidigm EP1 systems in wild population monitoring.


INTRODUCTION
The wolf (Canis lupus Linnaeus 1758) is a top-level predator, protected by law in several European countries. After centuries of decline worldwide, wolf populations are now expanding in anthropic areas, where they prey on both wild and domestic ungulates (Espuno et al., 2004; Gazzola et al., 2008). Action plans have been designed, at national and European level, to recognize the major threats to the species' survival and to identify the conservation priorities and guidelines for wolf coexistence with humans (Boitani, 2000; Fritts et al., 2003).
In Italy, legal protection (1976) and active conservation efforts have led to a significant recovery of the species, which is currently expanding in the Apennines and western Alps (Boitani, 1992; Poulle et al., 1999; Valière et al., 2003). Nevertheless, the Italian wolf population is characterized by low genetic variability (Randi et al., 2000; Lucchini et al., 2004), which might threaten its long-term persistence. Furthermore, wolves are also expanding into anthropic areas where cross-breeding with dogs might compromise their genetic integrity (Boitani, 2000; Genovesi, 2002; Vilà et al., 2003). Fundamental population parameters such as abundance, pack composition and dispersal rates are still poorly understood, so this expanding wolf population needs to be carefully monitored (Genovesi, 2002). However, wolves are very elusive, and information on their distribution, demographic structure, variability and genetic identification is difficult to obtain directly. For these reasons, wolf-howling, snow tracking, fecal sample collection, diet analysis and, more recently, non-invasive genetics have become popular techniques in wolf studies (Caniglia et al., 2010a; Ciucci and Boitani, 2010; Fabbri et al., 2007; Jędrzejewski et al., 2005; Marucco et al., 2009; Scandura et al., 2011).
Non-invasive genetics is a combination of field, laboratory and analytical techniques that allow the study of the biology of natural populations without observing or capturing individuals (Broquet et al., 2007). Microsatellites (STRs) have been the marker of choice in the last two decades, used to identify species and to detect individual genotypes from non-invasive samples (Taberlet and Luikart, 1999; Broquet et al., 2007). However, the DNA degradation of non-invasive samples, which leads to low amplification rates and genotyping errors, risks the generation of false genotypes that do not correspond to any extant individual (Bonin et al., 2004). The most frequent genotyping errors are allelic dropout (ADO: one allele of a heterozygous individual is not amplified during a positive PCR) and false allele amplification (FA: artifacts resulting from slippage during the PCR cycles and misinterpreted as true alleles because they have the same characteristic shadow band profile); both can lead to the number of individuals being overestimated (Broquet and Petit, 2004). To improve genotyping success and reliability, it is preferable to amplify DNA fragments that are as short as possible. The use of single nucleotide polymorphisms (SNPs) requires only the amplification of very short fragments, which makes them particularly suitable for non-invasive genetic monitoring projects (Seddon et al., 2005).
SNPs represent the most widespread source of sequence variation within genomes (Brumfield et al., 2003). They have emerged as valuable genetic markers in conservation genetics. SNPs are predominantly biallelic markers and are inherently less informative than multiallelic microsatellites when used for individual identification, parentage analysis and population genetics. However, their simpler mutational dynamics strongly reduce the risk of homoplasy (Syvänen, 2001; Vignal et al., 2002; Brumfield et al., 2003; Chen and Sullivan, 2003). Furthermore, fast and inexpensive methods are available to screen hundreds or thousands of SNPs per sample per population (Chen and Sullivan, 2003; Ellegren, 2008; Wang et al., 2009). SNP genotypes, based on single nucleotide changes, are universally comparable and do not require standardization across detection platforms. In contrast, it is difficult to compare microsatellite data sets produced by different laboratories, due to inconsistencies in allele size calling and misinterpretation of the electropherograms (Vignal et al., 2003). In genetic monitoring projects involving carnivores like the wolf, brown bear and lynx, which have a huge dispersal capacity and wide territories and which may cross country borders, collaborations between laboratories are necessary, as suggested in the Action Plan for wolf conservation in Europe (Boitani, 2000). Despite these advantages, the applications of SNPs in non-invasive monitoring projects to investigate ecological and conservation issues are limited (Sanchez and Endicott, 2006), and individual microsatellite genotyping from fecal samples remains predominant. Many new technologies for SNP genotyping have been developed in the last few years and it can be difficult to choose an appropriate method for a given application (Chen and Sullivan, 2003), in particular for non-invasive DNA analysis. In this study, we compared three different techniques for SNP genotyping of non-invasive DNA: Pyrosequencing (Biotage), SNaPshot® (Applied Biosystems), and the TaqMan® Assay (Applied Biosystems) used in real-time PCR, evaluating the amplification success, PCR error (ADO, FA) rates and laboratory costs. Pyrosequencing is a non-electrophoretic real-time DNA sequencing method in which enzymatic reactions yield detectable light, proportional to the number of incorporated nucleotides (Ronaghi et al., 1998; Troell et al., 2003). SNaPshot® is one of the most common commercial technologies based on a minisequencing reaction using fluorescent ddNTPs (Sobrino et al., 2005). Quantitative real-time PCR can be used for the allelic detection of a single nucleotide polymorphic site using PCR assay probes labeled with different fluorescent reporter dyes, specific for each allele.
Pyrosequencing, SNaPshot® and the TaqMan® Probe Assay are standardized methodologies used for high-throughput SNP analysis. They have already been successfully applied to degraded DNA analysis for species identification (Moran et al., 2008; Morin and McCarthy, 2007), human forensic case resolution (Nilsson et al., 2006; Tschentscher et al., 2008), investigation of anthropological issues (Quintáns et al., 2004), and evaluation of DNA quality (Morin et al., 2000); thus they appear promising for SNP analysis of non-invasive DNA.
Although SNPs could replace STRs in population and conservation genetic studies, which are usually based on problematic DNA samples, it remains unclear how many SNP markers will be required, or what the optimal characteristics of these markers should be, in order to obtain sufficient statistical power to detect different levels of population differentiation (Morin et al., 2009). Therefore, we compared the efficiency and reliability of amplification, identification of individuals, and estimation of genetic diversity of six STRs and nine SNPs, using data from a non-invasive monitoring project. Finally, we compared the performance of these markers in population genetic and structure analyses and their capability to detect wolf x dog hybrids in the Italian wolf population.

Sample collection and DNA extraction
In this study we analyzed three sets of samples. Set one (I) includes DNA extracted from tissues collected mainly from the wolf source population in the central and southern Apennines, and from scats collected in recent expansion areas in the northern Apennines (Table 1). All wolf tissues were collected from carcasses that had the typical Italian wolf coat color pattern without any detectable morphological or genetic signals of hybridization with dogs (Randi et al., 2000; Randi and Lucchini, 2002). Scats were collected in the northern Apennines during a LIFE project and during an ongoing non-invasive wolf population monitoring project supported by the Emilia-Romagna region from 2002 to 2008 (Fig. 1) (Caniglia et al., 2010b). We split the Italian wolves into three sub-groups according to their geographic origin: northern, central and southern Apennines. Although there is no obvious geographical break in the wolf distribution, we maintained this subdivision in order to separate samples collected from central and southern areas, where the species survived during the bottleneck in the 1970s, from samples collected in the area of recent expansion in the northern Apennines (Fabbri et al., 2007). Moreover, we analyzed 57 dog tissue and blood samples, collected from both domestic individuals of several breeds and feral dogs, in areas where they are sympatric with wolves. Furthermore, we analyzed 15 hybrid (wolf x dog) individuals, three of which were from captivity. They all showed anomalous phenotypic characters and had previously been identified as hybrids by genetic analysis (Lucchini et al., 2002). The samples of set I were analyzed in order to obtain information about the distribution of genetic variability of nine SNP markers in the Italian wolf population, in dogs and in hybrid individuals.
Set two (II) includes 43 wolf scat samples collected during the monitoring project in Emilia-Romagna, belonging to three different DNA quality categories previously assessed by genotyping them at six microsatellite loci (see also STR genotyping). The good quality category includes samples (n = 14) reliably genotyped at all loci; the medium quality category includes samples (n = 14) reliably genotyped at 50% of loci, while the low quality category includes samples (n = 15) reliably genotyped at less than 50% of loci. Reliability was assessed with the software Reliotype (Miller et al., 2002). This dataset was used to compare the efficiency of the three different SNP genotyping methodologies: Pyrosequencing, SNaPshot® and the TaqMan® Assay.
Finally, we randomly selected another 46 non-invasive scat DNA samples (set III) to compare the performance of nine SNPs and six STRs in individual genotyping. The DNA quality of the samples was pre-screened by PCR amplification at two microsatellite loci (for protocol see below).
In each case, total DNA from tissue and blood samples was extracted using a guanidine thiocyanate and silica protocol (Gerloff et al., 1995), while DNA from fecal samples was obtained with the Qiagen DNeasy 96 Blood & Tissue Kit on a robotic platform (MultiProbe II EX Liquid Handling System). Scat samples and DNAs were processed in separate rooms to avoid contamination. One negative control sample (no DNA) was added to each PCR or laboratory procedure.
Table 1. Origin, size and type of the analyzed samples. Samples are grouped into three sets according to the analyses performed. The dog samples include both domestic individuals, obtained from veterinary practice (n = 26), and feral individuals, living sympatrically with wolves (n = 31).
§ Six STR loci are: FH2004, FH2088, FH2096 and FH2137 (Francisco et al., 1996), CPH2 and CPH8 (Fredholm and Wintero, 1995).
# Six SNP loci are: 1C06_138, 38K22_150, 96B17_422, 182B11_138, 309N24_298, 310M20_207 (Andersen et al., 2006).
* Nine SNP loci are: 1C06_138, 38K22_150, 182B11_138, 309N24_298, 168J14_149, 218J14_81 (Andersen et al., 2006), 372M9_32, BLA22_199, BLB52_368 (obtained from Seddon et al., 2005 and Sutter et al., 2004).

SNP discovery and genotyping
We analyzed 139 candidate loci in 10 dogs and 14-20 Italian wolf tissue samples, by resequencing DNA regions known to contain SNPs in dogs (Guyon et al., 2003; Sutter et al., 2004; Seddon et al., 2005). We discovered 53 sequences showing from one to five polymorphic SNPs, which led us to identify a total of 106 SNPs in the Italian wolf population (Andersen et al., 2006). PCRs were performed using primers and protocols as described in Guyon et al. (2003), Sutter et al. (2004) and Seddon et al. (2005). PCR products were purified using Exo-Sap (Amersham) and sequenced in both directions using a 3130xl Genetic Analyzer (Applied Biosystems). Sequences were analyzed and aligned using SeqScape v. 2.5 (Applied Biosystems) and BioEdit v. 7.0.1 (Hall, 1999).
We compared the genotyping performance of three different allelic discrimination techniques: SNaPshot® Kit Analysis, Pyrosequencing, and the TaqMan® Assay, choosing six unlinked SNPs from the 106 SNPs previously detected: 1C06_138, 38K22_150, 96B17_422, 182B11_138, 309N24_298, 310M20_207 (Andersen et al., 2006). Pyrosequencing technology uses an enzyme-cascade system consisting of four enzymes and specific substrates to produce light whenever a nucleotide is incorporated to form a base pair with the complementary base in a DNA template strand. The amount of light is proportional to the number of incorporated nucleotides. To obtain the single-strand DNA necessary for pyrosequencing, either the forward or the reverse primer was biotinylated for immobilization of PCR products using the Vacuum Prep Tool (Biotage). Primers suitable for pyrosequencing were designed with Assay Design Software v. 1.0.6 (Biotage). The single-strand PCR products were pyrosequenced on the PSQ 96MA System (Biotage). Protocol details are available in Andersen et al. (2006).
The SNaPshot® Multiplex kit can investigate up to ten SNPs simultaneously by employing PCR followed by dideoxy single-base extension of an unlabeled primer. The primer is designed to anneal to the sequence adjacent to the SNP site. To analyze several SNPs simultaneously it may be necessary to add a non-annealing tail to a primer, making its length sufficiently different from the other primers to prevent the SNP markers from overlapping (ABI PRISM protocol). The first PCR was performed in 10 μl using the same primers designed for pyrosequencing analysis. PCR products were purified with Exo-Sap (Amersham). The second PCR was performed in multiplex using 1 μl of cleaned PCR product, 1 μl of SNaPshot® mix (Applied Biosystems), 0.2 μl of each extension primer (1 μM) and bidistilled water to a final volume of 10 μl, under the following thermal cycling conditions: 25 cycles of 96°C for 10 s, 50°C for 5 s and 60°C for 30 s. The PCR products were analyzed on a 3130xl ABI Automatic Sequencer (Applied Biosystems) using GeneScan®-120 LIZ (Applied Biosystems) as the size standard and the ABI software GeneMapper v. 4.0 for allele analysis.
Real-time PCR detects end-point fluorescence from two probes that differ at the polymorphic site and in the fluorescent dye attached to their 5' end. During the PCR annealing step, the TaqMan probes hybridize to the target DNA, and in the extension step the fluorescent dye (in the 5' position) is cleaved by the 5' nuclease activity of the Taq polymerase, leading to an increase in fluorescence (Sobrino et al., 2005). Primers and probes were designed using the Custom TaqMan® SNP Genotyping Assay Service by Applied Biosystems. The amplification reactions were performed on the 7500 Fast Real-Time PCR System (Applied Biosystems) in a final volume of 5 μl using TaqMan Universal Fast Master Mix (Applied Biosystems). Primer and probe concentrations, PCR cycles and conditions followed the manufacturer's instructions. The detected polymorphisms were analyzed using the 7500 Fast System SDS Software. Primer sequences will be provided upon request.

STR genotyping
Individual identification was performed by PCR amplification of each DNA sample at six microsatellite loci: four tetranucleotide (FH2004, FH2088, FH2096 and FH2137; Francisco et al., 1996) and two dinucleotide (CPH2 and CPH8; Fredholm and Wintero, 1995). These microsatellite loci were selected for being polymorphic in Italian wolves (Lucchini et al., 2004; Randi and Lucchini, 2002) and for their reliability in non-invasive DNA analysis (Caniglia et al., 2010b; Fabbri et al., 2007). In fact, tetranucleotide repeats are generally known to be less variable than dinucleotides, but their profiles are more stable, with a clearer shadow band pattern, especially in heterozygous genotypes where the two alleles differ by only one repeat unit. In order to check for genotyping errors due to ADO or FA, we analyzed all non-invasive samples following a multi-tube protocol (Gagneux et al., 1997; Taberlet et al., 1996). This protocol involves four to eight replicates for each locus/sample and a reliability analysis of multilocus genotypes using the software Reliotype. The quality of the DNA samples was initially screened by the amplification of two STR loci, retaining only samples showing more than 50% positive amplifications (Caniglia et al., 2010b).

SNP and STR genotyping error analysis
Genotypes obtained by amplifying non-invasive DNAs from sample set III at six STRs and nine SNPs were used to evaluate genotyping errors. The panel of six STRs includes the same markers used for individual identification described above, while the panel of nine SNPs includes four SNPs already used in the comparison among Pyrosequencing, SNaPshot® and the TaqMan® Assay (1C06_138, 38K22_150, 182B11_138, 309N24_298) and five additional SNPs (168J14_149, 218J14_81, 372M9_32, BLA22_199, BLB52_368) (Table 1).
The software Gimlet v. 1.3.3 (Valière, 2001) was used to estimate error rates in individual genotyping: ADO, FA, and the rate of successful PCRs. Gimlet allows the user to construct consensus genotypes from a set of PCR repetitions for each sample and to calculate the error rates by comparing the repeated genotypes with their consensus.
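The consensus-and-compare logic described above can be illustrated with a minimal Python sketch. It is not Gimlet's implementation: the consensus rule (an allele is accepted when seen in at least two positive replicates), the rate denominators and the function names `consensus` and `error_rates` are assumptions chosen for illustration, and the genotypes are invented.

```python
from collections import Counter

def consensus(replicates, min_obs=2):
    """Consensus genotype at one locus from replicate PCRs.

    replicates: list of genotypes, each a tuple of two alleles, or None for a
    failed PCR. Assumed rule: an allele is accepted when it appears in at
    least `min_obs` positive replicates.
    """
    counts = Counter()
    for g in replicates:
        if g is not None:
            counts.update(set(g))  # count each allele once per replicate
    alleles = sorted(a for a, n in counts.items() if n >= min_obs)
    if len(alleles) == 1:
        return (alleles[0], alleles[0])
    if len(alleles) == 2:
        return tuple(alleles)
    return None  # no data or ambiguous (more than two accepted alleles)

def error_rates(loci):
    """loci: non-empty list of replicate lists, one per sample/locus pair."""
    n_pcr = n_pos = n_het_pos = n_ado = n_fa = 0
    for reps in loci:
        cons = consensus(reps)
        positives = [g for g in reps if g is not None]
        n_pcr += len(reps)
        n_pos += len(positives)
        if cons is None:
            continue
        for g in positives:
            if any(a not in cons for a in g):
                n_fa += 1            # replicate carries an allele absent from the consensus
            elif cons[0] != cons[1]:
                n_het_pos += 1       # positive PCR of a heterozygous consensus
                if g[0] == g[1]:
                    n_ado += 1       # only one of the two alleles amplified
    return {
        "success": n_pos / n_pcr,
        "ADO": n_ado / n_het_pos if n_het_pos else 0.0,
        "FA": n_fa / n_pos if n_pos else 0.0,
    }

# Invented example: one heterozygous SNP locus with a dropout and a failed PCR,
# and one homozygous locus with a single false allele.
example = [
    [("A", "G"), ("A", "A"), ("A", "G"), None],
    [("C", "C"), ("C", "C"), ("C", "T"), ("C", "C")],
]
rates = error_rates(example)
```

In this sketch ADO is expressed per positive PCR of heterozygous consensus genotypes and FA per positive PCR overall; other denominators are in use, so the values are only comparable within one convention.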
We used the software Reliotype to evaluate the reliability (r) of the multilocus genotypes. Reliotype assesses how reliable an observed multilocus genotype is, using a maximum likelihood approach. The software estimates the dropout probability from the allele frequencies, assuming that false alleles do not occur or can be removed from the data (Miller et al., 2002).

SNP and STR variability analysis
Variability analysis was performed using the software GenAlEx v. 6.1 (Peakall and Smouse, 2006).
The number of different alleles (Na), Shannon's Information Index (I), observed (Ho) and expected (He) heterozygosity, unbiased expected heterozygosity (uHe), Fixation Index (F) and Principal Coordinate Analysis (PCA) were calculated separately for the Italian wolves (including tissue and scat samples) and dogs, and for SNPs and STRs. We further estimated the Hardy-Weinberg probability of identity (PI) for an increasing number of loci, i.e. the probability that different individuals share an identical genotype by chance. Moreover, as wolves in the same pack are known to be partially related (Mech and Boitani, 2003), i.e. they share alleles that are identical by descent, we evaluated the probability of identity between sibs (PIsibs; Waits et al., 2001). To determine the minimum number of loci needed for genetic tagging, we estimated the number of matches between wolf genotypes (command Matches in the option Multilocus from the GenAlEx menu) and simulated a dataset of 318 individuals for biallelic codominant markers.
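As a worked illustration of how PI and PIsibs shrink as loci are added, the following Python sketch applies the commonly cited single-locus formulas (as given in Waits et al., 2001) and multiplies them across loci, which assumes independent loci. The function names and the allele frequencies (twelve biallelic loci with both alleles at 0.5) are hypothetical and are not the frequencies observed in this study.

```python
def pi_locus(freqs):
    """Single-locus PI for unrelated individuals under Hardy-Weinberg:
    sum(p_i^4) + sum over i<j of (2 p_i p_j)^2."""
    hom = sum(p ** 4 for p in freqs)
    het = sum(
        (2 * freqs[i] * freqs[j]) ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return hom + het

def pisibs_locus(freqs):
    """Single-locus PI between full sibs:
    0.25 + 0.5*sum(p^2) + 0.5*(sum(p^2))^2 - 0.25*sum(p^4)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4

def cumulative(values):
    """Running product across loci (assumes loci are independent)."""
    out, running = [], 1.0
    for v in values:
        running *= v
        out.append(running)
    return out

# Hypothetical panel: 12 biallelic SNPs, both alleles at frequency 0.5.
panel = [[0.5, 0.5]] * 12
pi_curve = cumulative([pi_locus(f) for f in panel])
pisibs_curve = cumulative([pisibs_locus(f) for f in panel])
```

For a maximally informative biallelic locus the single-locus values are 0.375 (PI) and 0.59375 (PIsibs), which shows why PIsibs falls much more slowly than PI as loci are added.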
To evaluate the power of SNPs for population identification and to assign individuals to the detected populations, we used the software Structure v. 2.3 (Falush et al., 2003). We performed four independent runs for each value of K using the Admixture model (each individual may have ancestry in more than one parental population) and the LOCPRIOR model (sampling locations are used as prior information to assist the clustering, which improves results for data sets with few markers, few individuals or very weak structure, as suggested by Hubisz et al., 2009), with Independent Allele Frequencies (I-model). We set the following run parameters: 200,000 MCMC iterations, discarding the first 20,000 as burn-in, in line with other studies that used 10^4 burn-in and 10^5 MCMC iterations (Oliveira et al., 2007; Verardi et al., 2006). As suggested by Structure's authors, we ensured that the values of the summary statistics printed out by the program (e.g. α, F, likelihood) converged. The number of clusters was estimated assuming a uniform prior on K between 1 and 5. To detect the true K that best described our data we used the statistic ΔK, which identifies the greatest rate of increase in the posterior probability of the data, Ln P(D), between successive K values (Evanno et al., 2005). We then estimated the membership proportion (Q) of populations in the detected clusters, and the individual membership proportion q.
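The ΔK statistic can be sketched in a few lines of Python. This sketch uses one common formulation, the absolute second-order difference of the mean Ln P(D) across runs divided by the standard deviation of Ln P(D) at that K; the function name `delta_k` and the Ln P(D) values below are hypothetical and are not results from this study.

```python
from statistics import mean, stdev

def delta_k(lnp):
    """lnp: dict mapping K to a list of Ln P(D) values from replicate runs.

    Returns {K: deltaK} for every K that has both neighbours K-1 and K+1.
    Assumes at least two runs per K and a non-zero standard deviation.
    """
    out = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            second_diff = abs(
                mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1])
            )
            out[k] = second_diff / stdev(lnp[k])
    return out

# Hypothetical Ln P(D) values: four runs for each K from 1 to 5.
runs = {
    1: [-5000, -5002, -4998, -5000],
    2: [-4200, -4202, -4198, -4200],
    3: [-4150, -4160, -4140, -4150],
    4: [-4120, -4140, -4100, -4120],
    5: [-4100, -4130, -4070, -4100],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

With K tested between 1 and 5, ΔK is only defined for K = 2 to 4, so this statistic by itself cannot support K = 1.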

SNP allele detection: comparison of methodologies
We tested the efficiency of SNP genotyping from non-invasive DNA using three standard methodologies: pyrosequencing, SNaPshot® and the TaqMan® Assay. We selected six unlinked SNPs from a previous study (Andersen et al., 2006) and 43 fecal samples of different quality, as tested by STR amplifications. The rates of positive PCRs and ADO were calculated with Gimlet v. 1.3.3 using three replicate PCRs per locus per sample. The SNaPshot® methodology showed the highest positive PCR rate but only an intermediate ADO rate. The TaqMan® Assay showed the lowest ADO rate but an intermediate amplification rate. Pyrosequencing showed a high dropout rate and a low rate of positive amplifications (Table 2). We estimated the cost of genotyping a single SNP in 96 samples (corresponding to a 96-well plate) and the time required for laboratory work (Table 2). With the SNaPshot® methodology several SNPs can easily be multiplexed, but this involves two amplifications, an Exo-Sap purification and an electrophoresis run, making it the longest of the three procedures. The TaqMan® Assay by real-time PCR is the fastest of the three procedures, requiring only a single PCR, but it does not allow multiplexing of several loci. Pyrosequencing is both time-consuming and expensive. Having evaluated amplification success, allelic dropout, costs and working time, we chose real-time PCR with end-point fluorescence detection, as described above, to analyze SNPs in non-invasive DNA.

PCR success and genotyping error rate in non-invasive samples
To evaluate the use of SNP versus STR markers in non-invasive samples, we analyzed 46 scat samples that showed more than 50% positive amplifications in four replicates at each of two STR loci (i.e. at least four positive PCRs out of eight in total). Fig. 2 shows the average rates of positive PCRs and genotyping errors (ADO, FA) calculated for nine SNPs and six STRs using four independent PCRs per sample/locus. The proportion of positive PCRs was high for both STR (from 0.72 to 0.97) and SNP (from 0.86 to 0.92) markers. Genotyping errors occurred more frequently in STR (ADO: from 0.079 to 0.359; FA: from 0 to 0.1) than in SNP (ADO: from 0 to 0.18; FA: 0) amplifications (Fig. 2). Reliability estimates by Reliotype were slightly higher for SNPs than for STRs (Fig. 2). The differences between SNPs and STRs were at the limit of significance for ADO and FA but were not significant for r. Using a t-test for paired data, the p-values were 0.0472, 0.0574 and 0.702 for ADO, FA and r, respectively.
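For readers unfamiliar with the paired comparison, the test statistic can be sketched as follows. The text does not state how observations were paired; this sketch assumes pairing by sample (one SNP-based and one STR-based rate per scat), and the function name `paired_t` and the rates below are invented for illustration. The p-value would then be read from Student's t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for equal-length samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences.
    Assumes at least two pairs and non-identical differences.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))
    return t, n - 1

# Invented per-sample ADO rates for four scats (STR panel vs SNP panel).
ado_str = [0.10, 0.20, 0.30, 0.25]
ado_snp = [0.05, 0.10, 0.20, 0.25]
t_stat, df = paired_t(ado_str, ado_snp)
```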

Genetic variability and individual identification
We determined the individual genotypes at six STR and nine SNP loci in 318 Italian wolves, 57 domestic dogs and 15 wolf x dog hybrids (Table 1). We had previously selected the STR and SNP loci for their polymorphism in the Italian wolf population and to maximize the differences between wolves and dogs for species identification based on fecal samples. All microsatellites were polymorphic in all populations, showing 3-11 alleles per locus in wolves and 5-17 alleles per locus in dogs. Values of Ho ranged from 0.59 to 0.68 and He from 0.60 to 0.75 (Table 3). All SNP loci were polymorphic in the wolf sub-populations but not in dogs, where two SNPs were monomorphic. This is not surprising because we chose SNPs that were polymorphic mainly in wolf samples or fixed between wolf and dog. SNPs showed lower values of heterozygosity than microsatellites (Ho = 0.16-0.44; He = 0.20-0.44). The probability of identity for increasing locus combination was lower in the 9 SNPs than in the 6 STRs. Considering the 9 SNPs, this was 2.3E-02 in dogs and ranged from 2.0E-03 to 2.7E-04 in wolves. PIsibs obtained with the 9 SNPs ranged from 3.9E-02 to 1.4E-02 in wolves, indicating that 1.4 wolves among 100 sibs are expected to share, by chance, an identical genotype with another wolf (Table 3). In Fig. 3 we calculated the PI and PIsibs values, increasing the locus combination from 1 to 12 loci. Results showed that 12 SNPs are necessary to obtain PI and PIsibs values comparable with the six STRs.
PCA results with SNP and STR genotypes are shown in Fig. 5. Individual scores are plotted on two principal component axes (PC-I and PC-II), which cumulatively explain 47.8% and 45.3% (for SNPs and STRs, respectively) of the total genetic diversity. The plot shows a separation between wolves and dogs, but it is not as clear as in previous studies involving more loci (18 STRs; Randi and Lucchini, 2002; Lucchini et al., 2004).
A clustering test was performed with Structure v. 2.3, and the membership proportion (Q) of populations in the detected clusters is shown in Table 4. The number of clusters K that maximized the increase in the posterior probability of the data, Ln P(D) (Garnier et al., 2004), visualized as ΔK (second-order rate of change of the likelihood function with respect to K; Evanno et al., 2005), was two for both STR and SNP loci. Using STR loci, the wolf population was assigned to cluster I with a Q of 0.998 and dog samples to cluster II with a Q of 0.979, while hybrid individuals showed an intermediate assignment to both clusters, as expected (Table 4). Looking at the individual membership proportion qi, wolves were assigned to cluster I with average qi values (over four runs) ranging from 0.978 to 0.999. Using nine SNPs, wolves showed a lower population membership value (Q = 0.820) to cluster I, dog samples were assigned to cluster II with a Q of 0.997, and hybrid individuals showed intermediate Q values for the two clusters. Looking at the individual membership proportion qi, wolves were assigned to cluster I with average values ranging from 0.643 to 0.819 (Table 4).

DISCUSSION AND CONCLUSION
Non-invasive genetic methods have found several applications in population biology, ecology and conservation for a wide range of species (Waits and Paetkau, 2005). However, the low quality of the DNA, with consequent genotyping errors and low amplification success, tends to limit the efficiency and the reliability of the results. Compared to microsatellite markers, SNPs require shorter amplifications and potentially offer high throughput, which makes them particularly promising for the analysis of non-invasive DNA. The main obstacle to the use of SNPs is the difficulty of identifying them in non-model organisms (Garvin et al., 2010; Ryynänen et al., 2007). Identifying polymorphic and informative SNPs in small and endangered populations is difficult. Checking single sequences has proved to be expensive and requires considerable laboratory effort. In this study we used information from BAC clones that are available for the domestic dog (Canis familiaris). We chose 139 polymorphic sequences used in previous studies of dog breeds (Guyon et al., 2003; Sutter et al., 2004) and endangered wolf populations (Seddon et al., 2005). We tested 139 primer pairs, but only 43% of them showed polymorphic SNPs in the analyzed wolf samples (Table 5), with an average density of one SNP every 470 bases. Seddon et al. (2005) identified 25 variable fragments in Scandinavian wolves out of 40 sequenced, with an average density of one SNP every 306 bases. Identification of polymorphic and informative SNPs in small or differentiated populations is difficult, even when polymorphic sequences from closely related species are available.
Table 3. Estimates of genetic variability at nine SNP and six STR loci, computed using the software GenAlEx v. 6.1 (Peakall and Smouse, 2006). Mean number of different alleles (Na), Shannon's Information Index (I), observed (Ho) and expected heterozygosity (He), unbiased expected heterozygosity (uHe), Fixation Index (F), probability of identity for unrelated individuals (PI) and full sibs (PIsibs). Standard errors (SE) are in parentheses.
Furthermore, not all SNPs detected in the sequences are in positions suitable for primer and probe design, which must satisfy specific requirements. Applied Biosystems and Biotage distribute software for this purpose. However, sometimes the primer design fails or the amplification does not perform optimally. For example, in our study, out of 27 polymorphisms tested for primer/probe design with the TaqMan Assay protocol, seven failed and nine showed problems in the analysis of allelic discrimination. Given the difficulty of primer design in SNP genotyping, a large set of candidate SNPs is a necessary starting point for obtaining a sufficient number of useful markers. Next-generation sequencers are able to rapidly generate millions of DNA sequences at reduced cost. The new generation of genome sequencing can produce very large contig sequences, which can be applied to non-target organisms and used to investigate several biological questions. Alternatively, DNA microarrays or chips can screen hundreds of thousands of bp containing SNPs. For example, the Broad Institute has developed a custom canine SNP array in collaboration with Affymetrix. The goal of this project was to produce a SNP array useful in many dog breeds for genome-wide association mapping using at least 15,000 SNPs.
Several studies have compared the advantages and disadvantages of STR and SNP markers for various genetic issues in humans (Bailey-Wilson et al., 2005) and in non-model organisms (Coates et al., 2009; Ryynänen et al., 2007; Seddon et al., 2005), but no study has investigated the application of SNPs to non-invasive samples or evaluated the efficiency of the different techniques available for allele detection in non-invasive monitoring projects. A great variety of SNP genotyping protocols is available to researchers, but several aspects, such as sensitivity, reproducibility, cost and level of throughput, must be taken into account to determine which technology is most suitable for non-invasive genetic monitoring (Sobrino et al., 2005).
We compared the reliability of allele detection of three common genotyping techniques, Pyrosequencing, SNaPshot® and the TaqMan® Assay, using non-invasive DNA. The TaqMan® Assay proved to be the most reliable in allele detection (as in other studies; Li et al., 2010), even though it showed a lower than expected amplification success rate, which may be due to the presence of contaminants in non-invasive DNA. Kontanis and Reed (2006) showed that tannins and other oligomeric compounds with free phenolic groups can inhibit the TaqMan® Assay. Our results could be improved by testing different DNA purification or enrichment methods before the amplification process. A further strategy might be to dilute the DNA to a threshold concentration at which inhibitors are ineffective but which still contains the minimum number of template molecules required to generate reliable PCR products. However, this operation could increase the risk of failed amplification or allelic dropout in samples with low DNA content, such as non-invasive ones.
The use of SNPs instead of STRs in non-invasive projects has several potential advantages, such as better amplification power, lower genotyping error rates (ADO, FA) and unambiguous allele comparisons, but SNPs are less polymorphic and therefore less suitable for individual discrimination (implying a higher risk of "shadow effect"; Mills et al., 2000) and population clustering. To obtain a PI from SNP markers comparable with that of six STRs, at least 12 SNPs are needed. In fact, nine SNPs are not enough to distinguish individuals in the Italian wolf population: among the 318 analyzed genotypes we found 65 matches (Fig. 4). By simulating in GenAlEx a dataset of biallelic codominant markers with uHe = 0.5 and 318 individuals, we found that 12 biallelic markers are necessary to distinguish all individuals. The observed and expected variability is lower in the nine SNPs than in the six microsatellites. Ryynänen et al. (2007) found a significant correlation between the estimated heterozygosity of six biallelic markers and 14 microsatellites in two salmon populations. Our results suggest that nine polymorphic SNPs provide useful information on the general level of genetic diversity but are not powerful enough for individual detection or clustering analysis; in fact, SNPs have a lower mutation rate per generation (10⁻⁸ to 10⁻⁹) than STRs (10⁻⁴) (Brumfield et al., 2003). Results from Structure show that nine SNPs are enough to distinguish between wolves, dogs and hybrids when the program is run with the LOCPRIOR model, which uses prior population information and performs better on datasets with few markers, although they discriminate less well than six STR markers.
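The relationship between marker number and discriminating power can be illustrated with a minimal sketch. It assumes the usual single-locus probability-of-identity formulas for unrelated individuals and for full sibs, independent loci, and identical allele frequencies of 0.5 at every biallelic locus; the function names are hypothetical, and this is not the GenAlEx simulation used in the study.

```python
from math import comb


def pi_unrelated(freqs):
    """Single-locus probability of identity for two unrelated individuals."""
    # Both individuals share the same homozygous genotype ...
    homozygous = sum(p ** 4 for p in freqs)
    # ... or the same heterozygous genotype.
    heterozygous = sum(
        (2 * freqs[i] * freqs[j]) ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return homozygous + heterozygous


def pi_sibs(freqs):
    """Single-locus probability of identity for two full sibs."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def expected_matching_pairs(n_individuals, n_loci, freqs=(0.5, 0.5)):
    """Expected number of identical pairs, assuming independent, identical loci
    and unrelated individuals (an idealisation, not the observed population)."""
    return comb(n_individuals, 2) * pi_unrelated(freqs) ** n_loci


if __name__ == "__main__":
    for n_loci in (9, 12):
        print(n_loci, round(expected_matching_pairs(318, n_loci), 2))
```

Under these idealised assumptions the expected number of matching pairs among 318 individuals is above one with nine such loci and below one with 12. Real populations contain relatives and less variable loci, so the full-sib value from `pi_sibs` gives a more conservative bound.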
Our work confirms that non-invasive projects and population genetic studies need either a higher number of SNPs or an integrated combination of SNPs and STRs. For example, adding three STRs (chosen among the most polymorphic and reliable) to the nine-SNP dataset gives all 318 wolves a unique genotype (Fig. 4). On the other hand, adding STR markers to the SNP dataset increased the error rate (Fig. 2).
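The effect of adding markers on the number of unique genotypes can be checked with a short sketch such as the following. The data structure, locus names and toy genotypes are hypothetical, and a "match" is counted here as a pair of individuals with identical multilocus genotypes, which may differ from the definition used for Fig. 4.

```python
from collections import Counter


def genotype_key(individual, loci):
    """Multilocus genotype as a hashable key; alleles are sorted within each
    locus so that (A, G) and (G, A) count as the same genotype."""
    return tuple(tuple(sorted(individual[locus])) for locus in loci)


def count_unique_and_matches(individuals, loci):
    """Return (number of distinct genotypes, number of matching pairs)."""
    counts = Counter(genotype_key(ind, loci) for ind in individuals)
    n_unique = len(counts)
    n_matching_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return n_unique, n_matching_pairs


# Toy data (hypothetical): two SNPs and one STR scored in three individuals.
toy = [
    {"snp1": ("A", "G"), "snp2": ("C", "C"), "str1": (120, 124)},
    {"snp1": ("G", "A"), "snp2": ("C", "C"), "str1": (120, 128)},
    {"snp1": ("A", "A"), "snp2": ("C", "T"), "str1": (124, 124)},
]

if __name__ == "__main__":
    print(count_unique_and_matches(toy, ["snp1", "snp2"]))
    print(count_unique_and_matches(toy, ["snp1", "snp2", "str1"]))
```

In the toy data the first two individuals are indistinguishable with the two SNPs alone and are separated once the STR locus is added, which mirrors the pattern described above on a much smaller scale.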
Technologies based on hybridization and fluorescent detection have the advantage of requiring a single PCR reaction with no further manipulation steps, but they have limited multiplexing capability. High-throughput equipment for the TaqMan Assay is available and can be tested on degraded DNA. For example, the Fluidigm EP1 system for genetic analysis enables high-yield SNP genotyping through integrated fluidic circuits (IFCs) known as dynamic arrays. Two types of dynamic arrays are currently offered for genotyping applications: 96.96 and 48.48. The first tests 96 SNPs against 96 samples in a single run; the second, 48 SNPs against 48 samples (Wang et al., 2009). The LightCycler® 1536 System (Roche) allows analysis of up to 1536 samples using reduced reaction volumes. This system is designed to generate basic PCR results in mono- and dual-color assay formats for gene detection, gene expression and genetic variation analysis.
A subset of very informative SNPs (48 or 96) could be genotyped using relatively cheap technologies such as VeraCode or the Fluidigm EP1 system. The cost of SNP microchip bead applications is gradually decreasing, making them available to a wider range of users. The planned cost of a panel of 50 loci using VeraCode SNPs, SNPlex or the Fluidigm EP1 system would be less than 10 Euros, which is 1/20 of the amount necessary for genotyping 12 microsatellite loci with four PCR replicates. However, both values were estimated excluding salary costs as well as primer and PCR optimization costs.
These new techniques for high-throughput SNP genotyping could be used in non-invasive DNA monitoring projects, which usually require a large number of samples to be analyzed. However, we suggest carefully evaluating the effects of DNA degradation when these new technologies, developed for high-quality DNA, are used.
Funding - This work was supported by the Italian Ministry of Environment. The Emilia-Romagna Region supported the collection and genetic analysis of non-invasive samples. Laboratory work was carried out entirely at the genetic laboratory of the ISPRA site in Ozzano dell'Emilia (Bologna, Italy). This study was partly supported by a Marie Curie Transfer of Knowledge Fellowship BIORESC of the European Community's Sixth Framework Program (contract number MTKD-CT-2005-029957). Furthermore, we wish to thank the Congen program (funded by the European Science Foundation) and the Danish Natural Science Research Council for financial support to CP.

Fig. 1. Study area: Italian Peninsula. The small dark circles represent the non-invasive samples, white circles the invasive samples (tissues) and stars the hybrid (wolf x dog) samples. The Emilia-Romagna (EMR) regional border and the subdivision into Northern (NAp), Central (CAp) and Southern Apennines (SAp) are shown.

Fig. 2. Histogram showing the mean percentage of positive amplifications (% + PCR), allelic dropout (ADO), false amplification (FA) and reliability score (R). Black, light grey and dark grey bars represent the results for six STRs, nine SNPs and nine SNPs + three STRs, respectively.

Fig. 4. Number of unique genotypes in 318 individuals from the Italian wolf population by locus combination, from one to 12 markers. The line with dark dots represents the 9 SNPs, the line with white dots the 6 STRs and the line with triangles the mixed markers (9 SNPs + 3 STRs).

Fig. 3. PI (white symbols) and PIsibs (dark symbols) values for increasing locus combinations in SNP (lines with dots) and STR (lines with triangles) markers.

Fig. 5. Principal component analysis (PCA) of 318 wolf and 57 dog genotypes determined using nine SNPs (A) or six STRs (B), performed using the software GenAlEx v. 6.1 (Peakall and Smouse 2006). Dark dots represent dog samples, white dots Northern Apennine wolf samples, dark triangles Central Apennine wolf samples and white triangles Southern Apennine wolf samples.

Table 2. Comparison of Pyrosequencing, SNaPshot® and TaqMan® Assay for SNP genotyping in non-invasive DNA. Percentage of positive PCRs out of a total of 774 reactions; percentage of ADO; percentage of ADO calculated in samples with more than 50% positive PCRs, estimated using the software Gimlet v. 1.3.3; number of PCRs per locus per sample necessary to identify a reliable genotype; estimated cost per SNP per plate (96 samples); and working time needed.

Table 4. Results from Structure analysis: membership proportions (Q) of populations in the detected clusters, averaged over four runs, with 90% CIs in parentheses, obtained with the Admixture and I models with K = 2.

Table 5. Rate of SNP identification from sequencing. Number of sequenced loci for SNP identification (tested primer pairs and reliable sequences). Number of identified polymorphic SNPs. Number of primer/probe sets for SNP detection tested by RT PCR.