MULTIVARIATE STATISTICAL ANALYSIS OF SOME TRAITS OF BREAD WHEAT FOR BREEDING UNDER RAINFED CONDITIONS

To evaluate several agro-morphological traits in 56 wheat genotypes, an experiment based on a randomized complete block design with three replications was carried out. Principal component analysis (PCA) and factor analysis were used to understand the data structure and trait relations. The PCA showed that five components explained 69% of the total variation among traits. The first principal component accounted for 28% and the second for 13% of the total variation among traits. The first component was most closely related to grain number, floret number, tiller number, stem diameter, leaf width and spikelet number. Therefore, selection may be based on the first component, which would be helpful in a breeding program for the development of high yielding cultivars. The correlation coefficient between any two traits is approximated by the cosine of the angle between their vectors in the plot of the first two principal components; the most prominent relations were between grain diameter and grain yield, and between grain length and 1,000 seed weight. The factor analysis divided the eighteen traits into five factors; the first factor included stem diameter, leaf width, tiller number, spike length, floret number, spikelet number, grain number and grain yield. The second factor was composed of some morphological traits and indicated the importance of grain diameter, grain length, 1,000 seed weight and grain yield. The PCA and factor analysis methods were found to give complementary information, and such knowledge would assist plant breeders in making their selection. In other words, this data reduction would let the plant breeder reduce the field costs required to obtain the genetic parameter estimates necessary to construct selection indices.


Introduction
Wheat (Triticum aestivum L.) has been a food of the major civilizations for 8,000 years and is the most widely cultivated cereal crop in the world, feeding about 40% of the world population. It was cultivated on an area of 7.0 million hectares during 2011-2012 with a production of 13.8 million tons (FAOSTAT, 2012). Wheat is the second major food crop of the world after rice, while in Iran the crop ranks first in terms of total production. It plays an appreciable role in supplying carbohydrates, proteins and minerals in human diets (Schulthess et al., 2000; Stipešević et al., 2009; Deyong, 2011). For a successful wheat breeding program, genetic diversity is essential to meet the diversified goals of plant improvement, such as increased yield potential, adaptation, desirable quality, and resistance to biotic and abiotic stresses. Developing high yielding wheat cultivars is an important objective of breeding programs. Information on the structure of genetic diversity of wheat aids the plant breeder in choosing diverse parents for hybridization (Samsuddin, 1985; Mohamed, 1999). Breeding high yielding wheat cultivars, an important strategy for sustaining production, draws plant breeders' attention. Indirect selection through traits related to grain yield is one of the most important strategies in wheat breeding.
In crops grown for grain, the yield components are spike density, number of grains per spike, and 1,000-seed weight. Grain yield of wheat is the integration of many traits that affect plant growth throughout the growing period (Leilah and Al-Khateeb, 2005; Protić et al., 2009). Each trait changes to a different extent and in a different direction under the influence of environmental factors. The studied traits are usually correlated, and therefore it may be interesting to find general regularities in the interrelations that occur among them (Rymuza et al., 2012). Statistical analysis of agronomic traits could be informative in a wheat breeding program, because agronomic traits are a reflection of gene effects. Different statistical procedures have been used in modelling crop yield, including principal component analysis and factor analysis. Principal component analysis is a multivariate statistical method for exploring and simplifying complex data sets. Each principal component is a linear combination of the original variables, and so it is often possible to ascribe a meaning to what the components represent (Lewis and Lisle, 1998). Factor analysis, suggested by Walton (1972), has been used to identify growth and plant characters related to wheat (Leilah and Al-Khateeb, 2005).
To find such regularities, multidimensional analyses are used, one of which is principal component analysis (PCA). PCA makes it possible to transform a given set of correlated traits into a new system of uncorrelated traits, known as principal components. Collaku (1989) and Leilah and Al-Khateeb (2005) illustrated that number of tillers and 1,000-seed weight positively improved yield potential. The 1,000-seed weight was reported by many researchers as the variable most closely related to grain yield and was often used in selecting high yielding wheat cultivars (Deyong, 2011; Rymuza et al., 2012). PCA takes the total variance of the variables into account, describes the maximum of variance within a data set, and yields components that are functions of the primary traits. This approach seems to be effective in deciding which agronomic traits of a crop contribute most to yield. The identified agronomic traits should be emphasized in the wheat breeding program. The selection of parents plays an important role in a plant improvement program, as the genotypes of the progenies depend on the genotypes of the parents. Attempts to develop an ideal plant architecture of wheat have rarely been made. Briggs and Shebeski (1972) reported that the number of factors and the yield related traits changed with the genotype × environment interaction, and that the first seven factors explained 90.7% of total variation. Mohamed (1999) found that two factors (grain yield and spike density) accounted for 80.8% of variation among traits in some bread wheat genotypes. Leilah and Al-Khateeb (2005) studied bread wheat genotypes and showed that three factors, namely a yield factor, a biomass factor and a harvest index factor, accounted for 74.4% of total variation. One objective of this investigation was to clarify the association among some agronomic traits of bread wheat using PCA and factor analysis, which provide valuable information for plant breeders who are interested in researching the agronomic traits of bread wheat and breeding new high yielding wheat cultivars in future.

Material and Methods
Fifty-six genotypes of bread wheat listed in Table 1 were assessed using a randomized complete block design with three replications during the 2012-2013 growing season at the Research Farm of Agricultural Faculty, University of Maragheh (latitude 37°23'N, longitude 46°16'E, altitude 1,478 m). The experiment included 13 bread wheat cultivars which represent a range of phenotypic variation in maturity, adaptation zone, yield potential and date of release. The experimental region has a semi-arid climate characterized by relatively long winters. Minimum and maximum temperatures at the research station were -25°C and 40°C, respectively. The climate is characterized by mean annual precipitation of 378 mm, mean annual temperature of 18.3°C, and mean temperature of growing season of 33°C. Sowing was done by hand in plots with six rows 2.5 m long and 25 cm wide.
Tillage of all field plots was performed prior to the sowing date, and fertility was constrained by low organic matter and phosphorus contents. Fertilizer was applied before sowing: 60 kg ha⁻¹ of N, 30 kg ha⁻¹ of P₂O₅ and 20 kg ha⁻¹ of K₂O were broadcast on the surface and tilled into the soil. Weeds were controlled chemically. The number of days to flowering (DF) was recorded for each plot. Stem diameter (SD), plant height (PH), leaf number (LN), leaf length (LL), leaf width (LW), tiller number (TN), internode length (NL), peduncle length (PL), spike length (SL), floret number (FN), spikelet number (SN), grain number (GN), length of awn (AL), grain diameter (GD), and grain length (GL) were measured on guarded plants which were randomly selected from each plot. Similarly, thousand seed weight (TS) and grain yield (GY) of each plot were measured. The grain yield per plot was measured on a net plot of 2 × 1 m (2 m²).
Table 1. The names and codes of the 56 bread wheat genotypes.
The datasets were first tested for normality by the Anderson-Darling normality test (Anderson and Darling, 1952) and then were subjected to analysis of variance using an appropriate model. The principal component analysis was performed following Everitt and Dunn (1992), and a plot of the first two principal components was drawn. The factor analysis method (Cattell, 1965) consists of reducing a large number of correlated variables to a much smaller number of clusters of variables called factors. After extraction, the matrix of factor loadings was submitted to a varimax orthogonal rotation, as applied by Kaiser (1958). The array of communality, the amount of variance accounted for by the common factors together, was estimated by the highest correlation coefficient in each array as suggested by Seiller and Stafford (1985). The experimental data were statistically analyzed using SPSS version 13.0 (SPSS Inc., Chicago, IL, USA) and MINITAB version 14 (Minitab Inc., Harrisburg, PA, USA).
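As a rough illustration of the PCA step described above, the sketch below builds a Pearson correlation matrix from trait columns and extracts the leading principal component by power iteration. It is only a minimal sketch under stated assumptions: the study itself used SPSS and MINITAB, the function names (pearson, correlation_matrix, leading_component) are hypothetical, and the trait values are invented for illustration and are not taken from the experiment.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix of a list of trait columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

def leading_component(matrix, iterations=1000, tol=1e-12):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix.

    Uses plain power iteration; the start vector is arbitrary and is
    assumed not to be orthogonal to the leading eigenvector.
    """
    p = len(matrix)
    vec = [0.5 ** i for i in range(p)]
    value = 0.0
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in prod))
        new_vec = [v / norm for v in prod]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        new_value = sum(
            new_vec[i] * sum(matrix[i][j] * new_vec[j] for j in range(p))
            for i in range(p)
        )
        converged = abs(new_value - value) < tol
        vec, value = new_vec, new_value
        if converged:
            break
    return value, vec

# Hypothetical plot means of three traits for six genotypes (illustrative only).
grain_number = [38.0, 42.0, 35.0, 47.0, 40.0, 44.0]
floret_number = [52.0, 57.0, 49.0, 63.0, 55.0, 58.0]
days_to_flowering = [190.0, 186.0, 193.0, 184.0, 189.0, 188.0]

corr = correlation_matrix([grain_number, floret_number, days_to_flowering])
eigenvalue, direction = leading_component(corr)
# The trace of a correlation matrix equals the number of traits, so the share
# of total variance carried by the first component is eigenvalue / p.
share = eigenvalue / len(corr)
print(round(eigenvalue, 3), round(share, 3), [round(d, 3) for d in direction])
```

Working on the correlation matrix rather than on raw values puts traits measured in different units (days, millimetres, counts) on a common scale, which is also what makes the eigenvalue >1 retention rule meaningful.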

Results and Discussion
PCA was performed to identify which combination of agronomic traits is associated with high grain yield in bread wheat (Table 2). The scree plot of the PCA (Figure 1) shows that the first five eigenvalues account for most of the variance in the dataset. Five principal components with eigenvalues >1 were extracted, and their cumulative variance amounted to 69.24%. The PCA simplifies the complex data by transforming a number of associated traits into a smaller number of variables, the principal components. The first component accounts for more of the variability in the data than any succeeding component. The PCA grouped the estimated bread wheat variables into five main components, of which PCA1 accounted for about 27.80% of the variation, PCA2 for 13.07%, PCA3 for 11.93%, PCA4 for 9.24% and PCA5 for 7.19% (Table 2). PCA1 was related to grain number and its contributing traits such as floret number and tiller number, whereas PCA2 was related to grain yield and its contributing traits such as grain diameter, grain length and thousand seed weight (Table 2). The traits that contributed most positively to PCA1 were grain number, floret number, tiller number, stem diameter, leaf width and spikelet number, suggesting that this component reflected the yield potential of each genotype through some yield components. In addition, the traits that contributed positively to PCA2 were grain diameter, grain length and thousand seed weight, suggesting that this component also reflected the yield potential of each genotype. PCA3 was related solely to height properties: the traits that contributed most positively to it were plant height, internode length and peduncle length (Table 2), suggesting that this component reflected the height characteristics of each genotype. PCA4 was related to spike properties and its contributing traits such as length of awn, whereas PCA5 was related to number of days to flowering (Table 2). The traits that contributed most positively to PCA4 were spike length, length of awn and leaf length, suggesting that this component reflected the spike potential of each genotype.
The first two principal components, which together accounted for about 41% of the variance, were plotted to observe relationships between the measured traits of wheat (Figure 2). The correlation coefficient between any two traits is approximated by the cosine of the angle between their vectors (Yan and Rajcan, 2002; Dehghani et al., 2008). The plot mainly displays the relationships among the traits that had relatively large loadings on both the PCA1 and PCA2 axes. The most prominent relations shown in Figure 2 are strong positive associations between GD and GY; between GL and TS; between LL and AL; among GN, SL, LW and SD; and among TN, SN and FN, as indicated by the small acute angles between their vectors (r = cos 0° = +1). There was a near zero correlation between TS and GL, between LL and AL, between DF and LN, and of DF with LL (Figure 2), as indicated by the near perpendicular vectors (r = cos 90° = 0). There was a negative correlation between LN and LN, and between DF and TS (Figure 2), as indicated by the angle of approximately 180 degrees (r = cos 180° = -1). Some discrepancies between the plot predictions and the original data were expected because the first two principal components accounted for <100% of the total variation. The statistical properties of this interpretation have been described in detail by some researchers (Dehghani et al., 2008; Sabaghnia et al., 2011).
Increased grain yield potential is an important goal for plant breeders, and progress in yield potential results from the progressive accumulation of genes conferring higher yield or the elimination of unfavorable genes through the breeding process. The present investigation revealed that 1,000 seed weight and number of spikelets per spike had a strong relation with grain yield, suggesting the need for more emphasis on these components for increasing the grain yield in wheat. The PCA may allow the plant breeder more flexibility in deciding the number of plants to be evaluated, and the plant breeder could use the multivariate methods by first determining the combination of traits that constitute an ideal plant. By plotting the principal components that are considered to be important, plants close to the ideal plant would be selected (Yan and Rajcan, 2002). Principal components may be deemed important if their associated coefficients are of a magnitude relevant to the breeding targets. Given this apparent potential for using PCA, further work is required to compare multivariate methods for reaching actual gains.
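The cosine rule used to read Figure 2 can be checked on a toy case. The sketch below is an illustration under stated assumptions, not the authors' procedure: it takes two standardized traits with a hypothetical correlation r, for which the correlation-matrix PCA has the closed-form eigenvalues 1 + r and 1 - r, builds each trait's loading vector, and confirms that the cosine of the angle between the two vectors returns r. The function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def two_trait_loadings(r):
    """Loading vectors (on PC1, PC2) of two standardized traits with correlation r.

    For the matrix [[1, r], [r, 1]] the eigenvalues are 1 + r and 1 - r with
    eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2); a loading is the
    eigenvector entry multiplied by the square root of its eigenvalue.
    """
    a = math.sqrt((1.0 + r) / 2.0)
    b = math.sqrt((1.0 - r) / 2.0)
    return (a, b), (a, -b)

for r in (0.9, 0.0, -0.9):
    trait_1, trait_2 = two_trait_loadings(r)
    c = max(-1.0, min(1.0, cosine(trait_1, trait_2)))  # guard against rounding
    angle = math.degrees(math.acos(c))
    print(f"r = {r:+.1f}  angle = {angle:6.1f} degrees  cos(angle) = {c:+.3f}")
```

With only two traits both components are plotted, so the cosine reproduces r exactly. With eighteen traits and only two plotted components, the vectors are projections that discard part of the variance, so the cosine is an approximation; this is why discrepancies between the plot and the original correlations are to be expected.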
The factor analysis divided the eighteen traits into five groups or factors (Table 3), and the matrix of factor loadings was subjected to a varimax orthogonal rotation after the first extraction of factor loadings. This rotation accentuated the larger loadings in the extracted factors and suppressed the minor loadings, thus improving the opportunity of achieving a meaningful interpretation of the factors. The factor which made the largest contribution accounted for 28% of the total variation and was composed of some components of grain yield, including stem diameter, leaf width, tiller number, spike length, floret number, spikelet number, grain number and grain yield (Table 3). Increasing floret number, spikelet number, and grain number would be the most effective way of increasing yield. The second factor, which accounted for 13% of the total variation, was composed of some morphological traits and indicated the importance of grain diameter, grain length, 1,000 seed weight and grain yield. It is clear that the first two factors, which express the combined effect of stem diameter, leaf width, tiller number, spike length, floret number, spikelet number, grain number, grain diameter, grain length and 1,000 seed weight, were most closely associated with grain yield. Grain yield of bread wheat may be regarded as being composed of components such as heads per plant, spikelets per spike, grains per spike and 1,000 seed weight (Leilah and Al-Khateeb, 2005; Protić et al., 2009).
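What the rotation does can be sketched for the simplest case of two factors, where the varimax angle has a closed form. The code below is a minimal sketch, not the SPSS routine used in the study: the loading matrix is hypothetical, the function names are illustrative, and the raw (non-normalized) varimax criterion is assumed. It also shows that communalities, computed here as row sums of squared loadings, are unchanged by an orthogonal rotation.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: sum over factors of the variance of squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total

def varimax_two_factors(loadings):
    """Rotate a p x 2 loading matrix by the angle that maximizes the raw criterion."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form optimum for a single pair of factors.
    theta = 0.25 * math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t + y * sin_t, -x * sin_t + y * cos_t) for x, y in loadings]

def communalities(loadings):
    """Share of each trait's variance explained by the retained factors."""
    return [sum(l * l for l in row) for row in loadings]

# Hypothetical unrotated loadings of four traits on two factors.
unrotated = [(0.70, 0.45), (0.65, 0.50), (0.60, -0.55), (0.55, -0.60)]
rotated = varimax_two_factors(unrotated)

for before, after in zip(unrotated, rotated):
    print(before, tuple(round(l, 3) for l in after))
print(round(varimax_criterion(unrotated), 4), round(varimax_criterion(rotated), 4))
```

With more than two factors, the same pairwise rotation is usually applied to every pair of factors in turn until the criterion stops increasing.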
The third factor, which accounted for 12% of the total variation, was composed of leaf width, spikelet number and grain number (Table 3). The fourth factor accounted for 9% of the total variation, and the only character included in this factor was days to flowering. The rate of development of the plant, as reflected by the time of anthesis, was an important part of factor 4. The fifth factor, which accounted for 7% of the total variation, was composed of plant height, leaf number, spike length and length of awn. Variants of factor analysis and orthogonal rotation, in addition to those included in this investigation, should be carefully considered when designing comparisons. Variations of factor analysis and factor rotation are explained in detail by Leilah and Al-Khateeb (2005), and in making comparisons, consideration should be given to the costs involved in estimating genetic parameters of various crops (Dehghani et al., 2008; Hailegiorgis et al., 2011). Communality values of the factor analysis for the measured traits of bread wheat are given in Table 3; the results indicated that the PH, NL, SL, FN, GN, AL and TS traits had the highest communality and consequently a high relative contribution to wheat grain yield. To better understand the relationships among the measured traits of bread wheat, the relationships are graphically displayed in a plot of factor-1 and factor-2 (Figure 3). In this plot, the factor-1 axis mainly distinguishes the traits LN (leaf number) and DF (days to flowering), as Group-A, from the other traits. Grain yield (GY) groups near the TS (1,000 seed weight), GL (grain length), GD (grain diameter) and PL (peduncle length) traits, and we refer to these as Group-B (Figure 3). In the plot of the first two factors (Figure 3), the factor-2 axis mainly distinguishes the traits of Group-B from the other traits. Grouping of traits by multivariate methods is of practical value for bread wheat breeders (Walton, 1971; Ajmal et al., 2013). Representative genotypes with good performance for the identified traits may be chosen from the particular groups for hybridization programs with other approved genotypes. This strategy will help in identifying and combining favorable genotypes to obtain important traits in one genotype with a broad genetic base.
The other traits are grouped as: Group-C, including PH (plant height) and NL (internode length); Group-D, including LL (leaf length) and AL (length of awn); and Group-E, including TN (tiller number), SN (spikelet number), FN (floret number), SL (spike length), GN (grain number), LW (leaf width) and SD (stem diameter). Morphological traits of different crops have been used for estimation of genetic diversity and genetic improvement since they provide a simple way of quantifying genetic variation (Fufa et al., 2005). Analysis of genetic diversity in germplasm helps in classification of traits and identification of possible utility of different traits for breeding goals (Mohammadi and Prasanna, 2003). The PCA and factor analysis methods provided complementary ways of studying the data. For example, the PCA showed the initial importance of grain diameter, which was identified when its scores were introduced into the component. Furthermore, the factor analysis indicated that stem diameter, leaf width, tiller number, spike length, floret number, spikelet number and grain number were important. While wheat grain yield is built up from its components, the components in turn must be derived from the activity of the other related traits. The factor analysis shows the relationship between grain yield components and the morphological traits (Hailegiorgis et al., 2011). Similarly, the factor analysis indicates which yield components were associated with which morphological trait.
Taken together, the PCA and factor analysis methods then indicate which component of grain yield was the most important under the environmental conditions prevailing for this research.The plant breeder would have useful information which would enable him to identify which morphological traits he should select in order to achieve high yield (Godshalk and Timothy, 1988).
In the present research, analysis of the diversification of wheat traits was carried out depending on the applied growth system, which may to a large extent modify not only wheat traits such as grain number and 1,000 seed weight, but also the correlation of those traits (Sabo et al., 2002). Yield of wheat depended mostly on spikelet number, grain number, and 1,000 seed weight. Other investigations show, however, that the yield of wheat may to a greater extent be formed by the number of spikes, number of grains and 1,000 seed weight (Protić et al., 2009; Rymuza et al., 2012). The applied method of PCA made it possible to assess the relations among the wheat traits used for the analysis of observed diversity. Westerlund et al. (1991) and Schung et al. (1993) used PCA for the description of some wheat genotypes in regard to grain quality traits. The conducted PCA allowed the reduction from eighteen primary traits to four or five new variables as principal components, which can explain 69% of the variability of the primary data. The traits that form the first, second, and subsequent principal components, according to Ajmal et al. (2013), indicate the strongest discriminatory power. In the conducted experiment, the strongest discriminatory power was shown by the 1,000 seed weight, number of grains and grain yield.
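The variance shares reported for the five retained components can be cross-checked with a few lines of arithmetic. The sketch below uses only the percentages given in the text; the implied eigenvalues rest on the assumption that the PCA was run on the correlation matrix of the eighteen traits (so that the eigenvalues sum to 18), which the text does not state explicitly.

```python
from itertools import accumulate

N_TRAITS = 18  # number of measured traits

# Percentage of total variation per component, as reported in the text.
shares = {"PCA1": 27.80, "PCA2": 13.07, "PCA3": 11.93, "PCA4": 9.24, "PCA5": 7.19}

cumulative = list(accumulate(shares.values()))

# Eigenvalue implied by each share if the correlation matrix was analyzed
# (assumption): eigenvalue = share of variance x number of traits.
implied_eigenvalues = {name: pct / 100.0 * N_TRAITS for name, pct in shares.items()}

for (name, pct), cum in zip(shares.items(), cumulative):
    print(f"{name}: {pct:5.2f}%  cumulative {cum:5.2f}%  "
          f"implied eigenvalue {implied_eigenvalues[name]:.2f}")
```

The rounded shares sum to 69.23%, in line with the reported 69.24%; the first two components together cover 40.87%; and every implied eigenvalue exceeds 1, which is consistent with retaining five components.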

Conclusion
In this investigation, increased 1,000 seed weight and number of grains together were the main components of yield. Under these circumstances, selection should be made for increased grain number. Similarly, a considerable spike length was important for obtaining higher grain yield and, in general, a large leaf should result in a high yielding plant. PCA and factor analysis are statistical techniques that are useful for describing the relations that occur among bread wheat characteristics. The resulting uncorrelated components may be used for further analysis where the absence of collinearity among variables is required. Reduction of the analyzed wheat characteristics to a few principal components and factors makes it possible to explain about 70% of the total input data variability. Characteristics of wheat may be described with the use of the 1,000 seed weight and number of grains traits. Therefore, they could be considered in the development of desirable progenies in selection programs of wheat.

Figure 1. Scree plot showing eigenvalues in response to number of components for the estimated variables of wheat.

Figure 2. Plot of the first two PCAs showing relation among various bread wheat traits. For explanation of character symbols, see section Material and Methods.

Figure 3. Plot of the first two factors of factor analysis indicating relation among various bread wheat traits. For explanation of character symbols, see section Material and Methods.

Table 2. Loadings of PCA for the estimated traits of wheat.

Table 3. Rotated (varimax rotation) factor loadings and communalities for the estimated variables of wheat.