The Impact of Education on Wage Determination between Workers in Southern and Central-Northern Italy

The aim of this paper is to examine earnings dynamics in Italy, in order to explain earnings differences between southern Italy and central-northern Italy. In our analysis we use different techniques: ordinary least squares (OLS), quantile regression models and the algorithm developed by Machado and Mata (2005). In particular, the Machado and Mata (2005) algorithm allows us to examine the relative importance of differences in workers' characteristics and in their returns in explaining earnings differences between southern and central-northern Italy at a point in time, as well as across time within each macro-area. We focus on the role of differences in educational endowment and returns to education, one of the most important components of human capital in the stylised literature. The level of education determines substantial disparities in wage returns. However, this holds only for levels of education related to compulsory education.

Education is widely recognized as a crucial ingredient for both development and social cohesion. Evaluating human resources on the basis of the education levels attained by the population and/or labour force, disaggregated by macro-area, enables us to quantify the spread of education and, especially, to capture important differences at the regional level.
Evaluation of returns to education can be read as an indicator of the characteristics of the labour market, in particular of the employment opportunities available to more educated workers. An economic system of production with a strong demand for educated and skilled workers should be distinguished by a higher return on education, and a consequent income gap between highly educated workers and those attaining a lower educational level. By contrast, a lower return on education could signal a less intense demand for better educated employees on the part of the local labour market, or it might indicate an oversupply of educated workers compared with real opportunities, thus highlighting difficulties in the inclusion of skilled workers in the labour market. This situation could lead to the so-called brain drain, that is, the migration of more educated individuals abroad, or at least towards markets with greater capacity for absorption.
On the basis of these observations our aim is to examine earnings dynamics in Italy, in order to explain earnings differences between southern Italy and central-northern Italy. More specifically, by using earnings data for the two areas and a common analytical framework, we address an empirical issue in this paper: the extent to which differences in average earnings in southern and central-northern Italy can be explained by differences in endowment and returns to education and other observed factors affecting earnings.
The paper is structured as follows. Section 1 discusses some background literature and highlights the paper's aims, Section 2 briefly introduces the Italian educational system, Section 3 describes the data and their descriptive statistics, and Section 4 presents and discusses earnings estimations, and returns to education in particular. Section 5 lists the results of the decomposition analysis and Section 6 concludes.

Background Literature and Objective
Mincerian wage regression is one of the most widely used tools of empirical economics, and has been applied to many areas of labour economics. The Mincer wage equation plays a central role in the literature devoted to returns to education as well as in the literature on wage inequality. It is also used to investigate statistical discrimination, gender differences in wages and occupational choices (Colm Harmon, Vincent Hogan, and Ian Walker 2003; Joop Hartog, Hans Van Ophem, and Simona Maria Bajdechi 2004; James J. Heckman, Lance J. Lochner, and Petra E. Todd 2005; Charlotte Christiansen, Juanna Schröter Joensen, and Helena Skyt Nielsen 2006; Miroslav Verbič and Franc Kuzmin 2009; Corrado Andini 2011a, b, c, 2013a, b, c).
An excellent synthesis of research papers adopting the Mincer equation has been provided by David Card (1999). The reviewed works generally focus on estimating the average impact of schooling on earnings by means of both ordinary least squares and instrumental variable techniques. Starting from the seminal work by Moshe Buchinsky (1994), the last few years have seen the publication of numerous estimates of the schooling coefficient along the conditional wage distribution, with the frequent finding that education has a positive impact on within-group wage inequality, as suggested by Pedro S. Martins and Pedro T. Pereira (2004) and Andini (2007a, b).
In this work, we use quantile regression models, which are more suitable than ordinary least squares (OLS) models for those countries where there is significant heterogeneity within the labour force, in terms of both earnings and the impact of individual characteristics on earnings. Subsequently, we use the decomposition algorithm developed by José A. F. Machado and José Mata (2005) in order to examine the relative importance of differences in workers' characteristics and in their returns in explaining earnings differences between southern and central-northern Italy at a point in time, as well as across time within each macro-area. The use of the Machado-Mata methodology allows us to study the coefficient effect at each quantile and to account for heterogeneity in returns to individual characteristics (see Heckman and Xuesong Li 2004) as well as heterogeneity in the characteristics themselves across the earnings distribution. In particular, we focus on the role of differences in educational endowment and in returns to education, which is, perhaps, the most important component of human capital in the stylised literature.

The Italian Educational System. A Short Description
Education in Italy is compulsory from the age of 6 to the age of 16, and is divided into the following stages: primary or elementary school (scuola primaria or scuola elementare), lower secondary school or middle school (scuola secondaria di primo grado or scuola media), upper secondary school or high school (scuola secondaria di secondo grado or scuola superiore), university (università) and post-university (PhD and Master's degree). Italy has both public and private education systems.
Primary school, which lasts five years, is commonly preceded by three years of non-compulsory nursery school (known as asilo). Until middle school, the educational curriculum is the same for all pupils: although one can attend a private or state-funded school, the subjects studied are the same (with the exception of special schools for the blind or the hearing-impaired). The students are given a basic education in Italian, English, mathematics, natural sciences, history, geography, social studies, physical education, visual arts and music.
Secondary education in Italy lasts eight years and is divided into two stages: scuola secondaria di primo grado (lower secondary school), also broadly known as scuola media, and scuola secondaria di secondo grado (upper secondary school or high school), also broadly known as scuola superiore. Lower secondary school lasts three years (roughly from age 11 to 13), whilst high school lasts five years (roughly from age 14 to 19). At the end of the final year of high school there is an exam known as esame di maturità. A score of 60% is required to pass this exam and obtain a high school diploma, which gives access to university education.
Italy has an extensive international network of state-run and private universities and colleges offering degrees in higher education. State-run universities account for the majority of higher education institutes in Italy, and are managed under the supervision of Italy's Ministry of Education. Italian universities also offer officially recognized titles such as PhDs, and they organise Master's degree courses.

Data Description
In this paper the analysis draws on data from the Bank of Italy Survey on Household Income and Wealth (SHIW). This is a survey conducted every two years that collects information on the economic behaviour of Italian households at the microeconomic level. We focus on 1987, 1995 and 2006 because we want to consider the dynamics of wages in the first year of data availability, in the middle year and in the last one. The choice of 2006 as the last year is guided by the desire to have homogeneous data in order to compare the results obtained. In particular, the university reform, which came into force under Legislative Decree 270/2004, provided for university reorganization. This reform started in the academic year 2008/2009. Since the SHIW surveys are held every two years and the next survey year after 2006 would have been 2008, when the reform came into force, we considered the most recent year prior to the reform.
PANOECONOMICUS, 2016, Vol.63, Issue 1, pp. 25-43
Moreover, we selected the sub-sample of employees by removing self-employed workers, since earnings for the latter are driven by more complex factors, such as taxation, and they are structurally different from employees. Furthermore, the sample is restricted to those who are between 15 and 65 years old. The final sample contains 7230 observations for 1987, 6479 for 1995 and 5934 for 2006 (the table in the Appendix describes the variables we use in our analysis).
To obtain a picture of the unconditional wage dispersion for different ranges of schooling in southern Italy and in central-northern Italy, the 10th, 25th, 50th, 75th and 90th quantiles of the log hourly wage are listed in Table 1. The table also shows a measure of wage dispersion, the 0.9-0.1 spread. The gap between the two areas in the unconditional log hourly wage is wider at the bottom of the distribution and narrows towards the upper tail. Furthermore, differences in the quantiles between southern Italy and central-northern Italy are smaller for people with university and postgraduate qualifications. The 0.9-0.1 spread, across the different ranges of schooling, is higher in southern Italy than in central-northern Italy, except for university and postgraduate education, where the spread is greater in central-northern Italy than in southern Italy (Figure 1).
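The quantities summarised in Table 1 can be illustrated with a minimal sketch. The wage values below are invented for illustration and are not SHIW data; the helper names `empirical_quantile` and `spread_90_10` are hypothetical, and the linear-interpolation rule is one common convention, not necessarily the one used for the table.

```python
import math

def empirical_quantile(values, q):
    """Quantile of a list of numbers by linear interpolation, for 0 <= q <= 1."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)          # fractional position in the sorted sample
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return xs[lo] * (1 - frac) + xs[hi] * frac

def spread_90_10(log_wages):
    """0.9-0.1 spread: distance between the 90th and 10th quantiles."""
    return empirical_quantile(log_wages, 0.9) - empirical_quantile(log_wages, 0.1)

# Hypothetical hourly wages for one area and one schooling range
hourly_wages = [5.0, 6.0, 6.5, 7.0, 8.0, 9.0, 10.0, 12.0, 15.0, 20.0, 30.0]
log_wages = [math.log(w) for w in hourly_wages]

# One row of a Table 1 style summary: selected quantiles plus the spread
table_row = {q: empirical_quantile(log_wages, q) for q in (0.10, 0.25, 0.50, 0.75, 0.90)}
spread = spread_90_10(log_wages)
```

Because the wages are in logs, the spread is the log of the ratio between the 90th and 10th quantiles of the hourly wage, so a larger value means proportionally greater dispersion.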

Earning Equation and Rate of Return to Education across the Quantile Regression Model
As mentioned above, our aim is to estimate returns to education in southern Italy and in central-northern Italy: following the economic literature, we estimate separate Mincer equations for workers in these two areas. In the model, y is hourly wages, age is a proxy for experience, and educ is a vector of dummies capturing five different education levels (elementary, middle, high, university and postgraduate).
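As a sketch only, one specification consistent with this description is given below. The logarithmic dependent variable, the quadratic age term, the control vector $x_i$ and the symbols $\alpha$, $\beta$, $\gamma$, $\delta$ and $\varepsilon$ are assumptions made for illustration and are not notation taken from the paper.

```latex
\ln y_i = \alpha + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{age}_i^{2}
        + \mathrm{educ}_i'\gamma + x_i'\delta + \varepsilon_i
```

Under this reading, each element of $\gamma$ would measure the log-wage premium of an education level relative to the omitted level, and the quantile regression counterpart would let all coefficients vary with the quantile considered.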
Mincer's model produces unbiased estimators if the education variable is not influenced by other variables (it is exogenous). If we assume that the education variable depends on other variables (it is endogenous), as seems more realistic, OLS estimators are biased and instrumental variables are needed to obtain consistent estimates. One method used to address this problem is the extended earnings function, which involves the replacement of the education variable (years of education) with a set of dummy variables corresponding to the different degrees (or levels of education) obtained. The control variables are reported in the Appendix.
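The replacement of years of education with a set of level dummies can be sketched as follows. The function name `education_dummies` and the choice of elementary school as the omitted reference category are assumptions for illustration; the paper does not state which level is omitted.

```python
# The five education levels named in the text, in ascending order
LEVELS = ["elementary", "middle", "high", "university", "postgraduate"]

def education_dummies(level, reference="elementary"):
    """Return the dummy vector for one worker, omitting the reference level.

    Omitting one level avoids perfect collinearity with the intercept, so each
    coefficient is read as a premium relative to the reference category.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown education level: {level}")
    return {f"educ_{name}": int(name == level) for name in LEVELS if name != reference}

high_school_worker = education_dummies("high")
elementary_worker = education_dummies("elementary")
```

A worker at the reference level has all dummies equal to zero, which is why the returns estimated for the other levels are interpreted relative to that group.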
The Mincer equation is first estimated by using ordinary least squares (OLS), which focuses on mean effects. Quantile regression (Roger Koenker and Kevin F. Hallock 2001) is a natural extension of the OLS estimation of the conditional mean model: it models the conditional quantile as a linear function of observed heterogeneity, providing a detailed description of the conditional wage distribution. Specifically, we estimate quantile regressions for the 10th, 25th, 50th, 75th and 90th quantiles of the hourly wage distributions of each area, controlling for individual characteristics.
Estimates of Mincerian equations for southern Italy and central-northern Italy are reported in Table 2; in this table we report coefficients for only some control variables, namely whether the interviewee is head of the household (cfdic) and whether the person is a woman. Detailed results are available from the authors upon request. We report both OLS estimates and quantile regression estimates for the two macro-areas and for the three years considered. The estimated coefficients are mostly significant and the pseudo R-squared values indicate a reasonable degree of fit of the Mincer specification given the cross-sectional nature of the data. Focusing only on returns to education, it may be observed that OLS estimates increase consistently with the education level, for both macro-areas and for all years considered, but they decrease over the years. Importantly, education has a greater weight in determining the wage premium for southern Italy than for central-northern Italy. These results should be taken with caution. In fact, OLS analysis estimates the relation between the mean value of the dependent variable (hourly wages) and variations in the explanatory variables. However, the marginal effects of changes in some of the variables in our model may not be equal across the whole distribution of hourly wages. In other words, the estimated coefficients may be a poor estimate of the relation between some of the explanatory variables and hourly wages at different quantiles of its distribution.
Quantile regression is a useful way to overcome this problem, as it provides estimates of the regression coefficients at different quantiles of the dependent variable. Furthermore, two additional features of quantile regression make it better suited to our data than traditional OLS. First, the classical properties of efficiency and minimum variance of the OLS estimator are obtained under the restrictive assumption of independently, identically and normally distributed error terms. When the error distribution deviates from normality, the quantile regression estimator may be more efficient than that of OLS (Buchinsky 1998). Second, as the quantile regression estimator is derived by minimizing a weighted sum of absolute deviations, the parameter estimates are less sensitive to outliers and long tails in the data distribution. This makes the quantile regression estimator relatively robust to heteroskedasticity of the residuals.
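The weighted sum of absolute deviations mentioned above can be made concrete with a minimal sketch for the simplest case, a model with an intercept only, where the minimiser is a sample quantile. The wage values are invented, and the function names are hypothetical; a full quantile regression with covariates needs a linear programming solver and is not attempted here.

```python
def check_loss(u, theta):
    """Check function: weight theta on positive residuals, (1 - theta) on negative ones."""
    return u * (theta - (1.0 if u < 0 else 0.0))

def quantile_objective(candidate, ys, theta):
    """Weighted sum of absolute deviations of ys around a candidate value."""
    return sum(check_loss(y - candidate, theta) for y in ys)

def intercept_only_quantile(ys, theta):
    """Minimise the objective over the data points.

    The objective is convex and piecewise linear with kinks at the data points,
    so searching the observed values is enough to find a minimiser.
    """
    return min(sorted(ys), key=lambda c: quantile_objective(c, ys, theta))

# Hypothetical hourly wages with one extreme value in the upper tail
wages = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 200.0]
median_fit = intercept_only_quantile(wages, 0.5)
q75_fit = intercept_only_quantile(wages, 0.75)
mean_fit = sum(wages) / len(wages)
```

In this example the single extreme wage pulls the mean far above every other observation, while the fitted median and 75th quantile stay among the typical values, which is the sense in which the estimator is less sensitive to outliers and long tails.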
The quantile regression results indicate the following: for the year 1987 returns to education are greater for southern Italy at least until the 50th quantile, while from the 50th quantile onwards education in central-northern Italy has higher returns, especially at the 75th quantile. From 1995 to 2006 the impact of education in southern Italy decreased and lost significance in determining the salaries of employees, in particular from the 50th quantile onwards. We observe a different reality in central-northern Italy, where returns to education tend to be more significant and crucial, especially in determining wages from the 50th quantile onwards.
In particular, for wages below the median (≤ 50th quantile), we observe that the wage distribution of graduate workers in southern Italy dominates that of graduate workers in central and northern Italy. Graduates in southern Italy perform better in terms of wages than graduates in central and northern Italy because of factors that are unobserved within our empirical exercise. These unobserved factors are mainly related to the delay with which students in southern Italy obtain a degree compared to students in northern Italy. The reason for the large number of students graduating behind schedule lies in the institutional and organizational aspects of the Italian university system (Carmen Aina and Francesco Pastore 2012). In particular, the lack of a selective admission test facilitates the entry into universities of unmotivated and low-skilled students, whose abandonment rates are high. Hence much time is required to graduate. Moreover, the presence of very low tuition fees during the period established for graduation does not encourage students to respect the schedule to obtain a degree (Pietro Garibaldi et al. 2012). In addition, there are also negative signals coming from the labour market, especially in southern Italy, which create a kind of deterrent to graduation on schedule (Aina, Eliana Baici, and Giorgia Casalone 2011).
In particular, the specialization in the traditional sectors of Italian industry (manufacturing) could explain the low demand for graduate workers, the poor performance of tertiary education and the tendency of young people to reduce their effort, hence the delay in graduation. In other words, as university students believe that the returns to their degrees are low, they also believe that graduating on schedule is not necessary. In their study on the economic consequences of graduating beyond the legal time, Aina and Pastore (2012) ascertain that the graduation delay has direct effects on wages, introducing wage penalties equal to 7% of the median wage.
If we compare returns to education in the three years considered, we may note a gradual reduction in the importance of education as a determinant of wages. The reduction in the importance of education is also emphasised by Istituto per lo Sviluppo della Formazione Professionale dei Lavoratori (2009) (Institute for the Development of Vocational Training of Workers), which reports that, during the period between 1993 and 2006, workers with high levels of education have been increasingly absorbed in jobs requiring low or medium qualifications. This has encouraged a decrease in returns to education and, hence, a tendency towards compression of inequality.
On observing the estimates for the coefficients of the control variables we find that women in both areas of the country earn less than their male counterparts, and gender discrimination, measured by the coefficient of the dummy variable for women, is higher in lower earnings quantiles. The age earnings profile in both areas of the country is quadratic and the coefficient for the head of the household (cfdic) is always positive and significant, especially for central-northern Italy. The signs of the regressors are in line with previous studies on Italian wages (see Massimiliano Bratti and Stefano Staffolani 2001; Daniele Checchi 2001; Giuseppe Puggioni 2001; Staffolani and Alessandro Sterlacchini 2001; Gianni Boero et al. 2011).

Machado-Mata Decomposition: Wage Differential in Returns to Education between Southern Italy and Northern-Central Italy
In this section we examine the relative impact of returns to education on earnings changes by using a stylised decomposition methodology. OLS and most statistical techniques focus on mean effects. They restrict the effect of the covariates to operate in the form of a simple "location shift" (Blaise Melly 2005a, b). The quantile regression model introduced by Koenker and Gilbert Bassett (1978) is more flexible than OLS and allows the effects of a covariate to be studied on the whole conditional distribution of the dependent variable (for a survey on empirical quantile regression see Koenker and Hallock 2001).
As in studies on sex, race or union wage differentials, the basic methodological approach is to estimate an earnings regression by using pooled data for southern and central-northern Italy employees and to include a dummy variable for the worker's geographical macro-area. One problem with the dummy variable approach is that the returns to productivity-related characteristics and job attributes are constrained to be equal across areas. The effect of the worker's geographical macro-area (southern or central-northern Italy) is limited to an intercept effect.
An alternative approach involves estimating separate earnings functions for individuals in southern and central-northern Italy. The Blinder-Oaxaca decomposition (1973) is a method for dividing the average earnings differential between two groups (or the average differential in the natural log of earnings, depending on the econometric model) into two components: the component that can be explained by differences in the average characteristics of group members (such as education) and the component that cannot be explained by such differences. A disadvantage of this approach is that it only focuses on differences at the mean of the two earnings distributions. If we consider only the mean of the regressors, we may miss some important factors explaining the difference between the two distributions. Machado and Mata (2005) propose an alternative decomposition procedure which combines quantile regression and a bootstrap approach. Following Melly (2005a, b), we can decompose the difference in the $\theta$th quantile of the log hourly wage distribution between southern Italy's workers and central-northern Italy's workers as follows:
$$\hat{Q}_\theta(w^{s}) - \hat{Q}_\theta(w^{cn}) = \left[\hat{Q}_\theta(w^{s}) - \hat{Q}_\theta(\tilde{w}^{cn})\right] + \left[\hat{Q}_\theta(\tilde{w}^{cn}) - \hat{Q}_\theta(w^{cn})\right] + \text{residual} \tag{2}$$
where $\hat{Q}_\theta(w^{j})$ is the $\theta$th empirical quantile of the wage distribution, with $j$ = south ($s$), centre-north ($cn$), and $\tilde{w}^{cn}$ is a random sample from the wage distribution that would have prevailed in the central-northern area if all covariates had been distributed as in the southern area. For further insights, see Melly (2007).
The first term represents the contribution of coefficients and the second term represents the contribution of the covariates to the difference between the $\theta$th quantile of southern Italy workers' wage distribution and the $\theta$th quantile of central-northern workers' wage distribution. The residual term arises because the sample is randomly generated, but it should asymptotically vanish. A detailed description of the technique and an analysis of its asymptotic properties are provided by James Albrecht, Anders Bjorklund, and Susan Vroman (2003) and Albrecht, Aico van Vuuren, and Vroman (2009).
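The logic of the simulation-based decomposition can be sketched under strong simplifying assumptions. The data are synthetic, all function names are hypothetical, and education level is the only covariate, so that the conditional quantile for each level is simply the sample quantile within that level (the case of a fully saturated dummy model). This is not the Melly estimator used for Table 3, which works with quantile regressions on the full set of controls.

```python
import math
import random

def sample_quantile(values, theta):
    """Order-statistic quantile: smallest value with at least a theta share at or below it."""
    xs = sorted(values)
    idx = min(len(xs) - 1, max(0, math.ceil(theta * len(xs)) - 1))
    return xs[idx]

def counterfactual_sample(coef_group, covariate_group, draws, rng):
    """Wages generated with coef_group's returns and covariate_group's characteristics.

    Each group is a list of (education_level, log_wage) pairs.
    """
    by_level = {}
    for level, wage in coef_group:
        by_level.setdefault(level, []).append(wage)
    levels = [level for level, _ in covariate_group]
    sample = []
    for _ in range(draws):
        theta = rng.random()        # step 1: draw a random quantile index
        level = rng.choice(levels)  # step 2: draw a covariate from the other group
        sample.append(sample_quantile(by_level[level], theta))  # step 3: fitted quantile
    return sample

def decompose(south, centre_north, theta, draws=5000, seed=0):
    """Split the gap at quantile theta into coefficient and characteristics effects."""
    rng = random.Random(seed)
    w_s = [w for _, w in south]
    w_cn = [w for _, w in centre_north]
    w_cf = counterfactual_sample(centre_north, south, draws, rng)
    q_s, q_cn, q_cf = (sample_quantile(v, theta) for v in (w_s, w_cn, w_cf))
    return {"raw": q_s - q_cn, "coefficients": q_s - q_cf, "characteristics": q_cf - q_cn}

def synthetic_group(n, shares, means, rng):
    """Hypothetical log wages: level drawn by share, wage equal to level mean plus noise."""
    levels = list(shares)
    weights = [shares[name] for name in levels]
    data = []
    for _ in range(n):
        level = rng.choices(levels, weights=weights)[0]
        data.append((level, rng.gauss(means[level], 0.3)))
    return data

rng = random.Random(42)
south = synthetic_group(2000, {"middle": 0.5, "high": 0.35, "university": 0.15},
                        {"middle": 1.6, "high": 1.9, "university": 2.3}, rng)
centre_north = synthetic_group(2000, {"middle": 0.4, "high": 0.4, "university": 0.2},
                               {"middle": 1.8, "high": 2.0, "university": 2.3}, rng)
result = decompose(south, centre_north, theta=0.5)
```

In this simplified version the two components add up to the raw gap by construction, because the same counterfactual quantile enters both terms; simulation noise affects how the gap is split between the two effects.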
In Table 3 we present the results obtained from Equation (2), that is, from the decomposition of the earnings gap between southern and central-northern employees at the selected quantiles, using Melly's (2005a, b) estimator applied to quantile regression. Here the focus is the decomposition of the wage gap into an endowment effect and a coefficients effect, in order to account for the presence of discrimination in the Italian labour market, looking at the different ranges of schooling (middle school, high school and university) separately (we omit the results for the other educational ranges, elementary and postgraduate, due to the limited size of the database, which could distort the results).
We consider three subsamples for the three different ranges of education in order to verify whether, for a specific level of education, there is wage discrimination against southern Italian workers vis-à-vis workers in central-northern Italy. Looking at the results in Table 3, especially at the columns for 1987, it may be noted that, for workers with middle school at the 10th quantile, the coefficient effect of -22% indicates that wages in southern Italy, for workers with middle school, are about 22% lower than wages in central-northern Italy (these coefficients can be interpreted as discrimination). This discrimination declines at the other quantiles.
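If the reported effects are differences in log hourly wages, which is an assumption here since the scale of the Table 3 entries is not restated in the text, they are log points and only approximate percentage differences. A minimal sketch of the conversion, with a hypothetical helper name:

```python
import math

def log_gap_to_percent(log_gap):
    """Convert a difference in log wages into an exact percentage difference."""
    return math.expm1(log_gap) * 100.0

# A gap of -0.22 log points corresponds to roughly -19.7 percent
pct = log_gap_to_percent(-0.22)
```

For small gaps the two measures almost coincide, so reading the effects directly as percentages is a common shorthand; the difference grows with the size of the gap.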
If we look at the results for employees with higher levels of education (high school and university), discrimination between workers from the southern area and those from the centre-north area still emerges, but it is lower. In addition, for graduate employees this discrimination is not significant. We find the same result for 1995 and 2006: greater discrimination is shown between southern and central-northern workers with a low level of education (middle school). The exception is represented by graduate employees in southern Italy for the years 1987 and 1995 up to the 50th quantile, with a wage premium of about 7% compared to graduate employees in northern Italy. The characteristics component shows that southern workers with any level of education should earn more than central-northern workers across the whole wage distribution. In other words, the specific characteristics (our covariates) of workers have a more positive impact on the wages of southern workers than on those of central-northern workers. The characteristics differential decreases as we move towards employees with higher levels of education. The relevance of the characteristics component tends to increase over the years, especially for the subsample of workers in the higher range of schooling. The raw difference between southern and central-northern workers, which is the sum of two effects (characteristics and coefficients), is particularly high for employees with a middle school qualification and is smaller for the remaining workers. It increases across all the years for workers with both middle and high school qualifications, at least up to the 50th quantile. For graduate employees, this increase is only observed for 1995.
In summary, we note that the substantial wage gap between workers in southern Italy and workers in central-northern Italy is mainly due to lower education levels (middle and high school).The level of education determines the substantial disparities in terms of the wage return but this would only hold for education levels related to compulsory schooling.
The characteristics (or endowment) effect and the coefficient effect for education, generated by using the Machado-Mata procedure, are reported in Figure 2. The coefficient effects for workers with a middle school qualification show, for all years, that the wage gap between southern and central-northern Italy decreases as we move towards higher quantiles. The characteristics effect is quite stable over the wage distribution and is close to zero in 1995 and 2006. For workers with high school diplomas we observe in 1987 that the coefficient effect is quite high for lower quantiles, decreases slightly up to the 20th quantile and then stabilises at around 8%. The wage gap tends to increase, especially in the lower quantiles, in 1995 and in 2006, and decreases sharply in the highest quantiles. The characteristics effect is quite stable for all years considered. Finally, there is a fairly unstable trend for graduate employees, differing with each year analysed. For 1987, from the 20th to the 60th quantile, the wage premium for graduate employees in southern Italy decreases for higher quantiles and assumes negative values. For 1995 a similar, less pronounced trend is observed, with a higher wage premium in the more extreme quantiles. Only in 2006 do we observe quite a stable trend around zero, except in the distribution tails. Unlike the wage gap (or premium), the characteristics differential seems to be stable over the wage distribution and does not vary with the quantile. Its mean is about 0% for 1987 and 1995 and 0.05% for 2006.
In Figure 3 we report only the coefficient effects. We observe for all years that the wage premium declines as we move up the income distribution for middle and high school, but a greater decline is found for central-northern workers than for southern workers. Only for graduate employees, especially for central-northern ones in 2006, do we observe evident growth in the wage premium at higher quantiles. This result is justified by the shorter time needed by northern Italian students to obtain their degrees compared to those from southern Italy. The spread between the two macro-areas widens especially during 2006, which resulted in an increase in the wage premium for graduates from northern Italy. Moreover, the lack of adequate job orientation and the poor development of contractual instruments that allow suitable work experience and training during the years of education are likely to produce a negative effect on the employment of southern Italian graduates and hence on their wages. Also, we should consider the rigidity factors that basically require interventions aimed at improving the efficiency of human capital (Floro Ernesto Caroleo 2012). In particular, a rigidity factor found in the labour market in southern Italy is the lack of demand for more educated workers due to the production structure, which is dominated by traditional manufacturing sectors with an intensive use of unskilled labour.

Figure 1
(0.9-0.1) Spread of the Log Hourly Wage for Different Years and Different Levels of Education for Southern Italy and Central-Northern Italy
Note: Control variables in the regressions are age, gender, head of the household, age cohorts, sector and occupation.
Source: Own calculations based on SHIW data.

Figure 3
Wage Premium by Educational Attainment for Employees from Southern and Central-Northern Italy

Table 1
Quantiles of the Log Hourly Wage for Different Levels of Education in Southern and Central-Northern Italy
Source: Own calculations based on SHIW data.

Table 2
Estimates of Mincer Equation: OLS and Quantile Regression

Panels: Year 1987, Year 1995, Year 2006. Columns: Coeff., Southern Italy, Central-Northern Italy.
Note: The table reports coefficient estimates of different quantile regressions; t-statistics are reported in parentheses. *** significant at 1%, ** significant at 5% and * significant at 10%. Other control variables in the regressions are: age cohorts, sector and occupation.
Source: Own calculations based on SHIW data.

Table 3
Machado and Mata Decomposition of the South/Centre-North Employee Wage Differential for Different Educational Levels
Note: The table reports coefficient estimates of different quantile regressions; t-statistics are reported in parentheses. *** significant at 1%, ** significant at 5% and * significant at 10%. Control variables in the regressions are age, gender, head of household, age cohorts, sector and occupation.
Source: Own calculations based on SHIW data.