Questions about Household Consumption in Surveys

Household total expenditure (consumption) is a key variable in many research areas. The problem is how to obtain precise information about consumption from each household without making the questionnaire so long and involved that it becomes a burden to the respondent. This paper reviews evidence from several sources on the usefulness of recall consumption questions. The evidence suggests that valid information can be collected by adding specific recall questions to general purpose surveys, and the paper offers a few recommendations on how to do so.


Introduction
Household consumption is a very difficult variable to measure on the basis of survey questions. The main goal of this paper is to present different methods that can be used to recover a measure of total household expenditure in general purpose surveys. Experience from countries with well developed national survey systems could be of great help for further improvement in this field.
The focus throughout this paper is on empirical research on high income countries. Closely related measurement issues arise in middle income and low income countries, but there are also significant differences (for example, the importance of small farms and small businesses, the lack of income data etc.). In many developing countries, including Serbia, the LSMS (Living Standards Measurement Study) program is implemented (http://www.worldbank.org/LSMS/). LSMS surveys have become an important tool for measuring and understanding household consumption through household surveys. In this paper, the focus is on the experience of two surveys from developed countries: the Canadian Out of Employment Panel (COEP) and the Italian Survey on Household Income and Wealth (SHIW).

The Significance of Information on Household Total Expenditure
Household total expenditure (consumption) is a very important phenomenon in many research areas, and it is desirable to have information on it. A partial list of the uses for such data includes:
a) Tracking changes in the distribution of material living standards over time. There is a great deal of interest in how the distribution of material well-being evolves over time (for example, is poverty increasing or decreasing?). Consumption (the purchase of non-durables and the flow of services from the stock of durables) is probably the best direct measure of material well-being and is the focus of a number of recent studies.
b) The impact of environmental shocks or policy changes on the material well-being of different households. Examples include the impact of retirement on material well-being (the so-called 'consumption retirement puzzle'); the role of Unemployment Insurance benefits in job search and in maintaining short run living standards; the adequacy of Social Assistance; the long run cost of job loss; and tests of full insurance.
c) Consumption and saving research. There are still far more unanswered questions in consumption and saving research than there are settled issues. Examples include the importance of the precautionary motive; the reaction of household expenditure to temporary and permanent tax changes; the importance of the retirement motive in saving; and the role of durables in smoothing mechanisms.
d) The use of consumption as a conditioning variable in life cycle models.
Under some circumstances current consumption can be taken as a 'sufficient statistic' for expectations and unobservable wealth in models of life cycle decisions such as labor supply, human capital formation and fertility. This is potentially very useful not only at a cross-section level but also over time: changes in consumption may signal (unobservable) changes in current circumstances or expectations.
Given these different research needs, there are a number of options:
a) Use aggregate time series (ATS) data. Given the ATS we currently have (means of levels of expenditures on different categories of goods), this is only useful under very restrictive circumstances. There are severe limits on what we can learn from studies based solely on ATS.
b) Use a proxy for consumption. Income is the most widely suggested. Baxter and Jermann (1999) found evidence that consumption growth is excessively sensitive to predictable changes in income. For example, researchers often use income for measures of inequality. This is problematic if income exhibits transitory fluctuations which most (but not all!) households can smooth. Many researchers also use measures of expenditures on food as proxies for total consumption.
c) Run a diary based survey. This is costly and can usually be justified only by the need for central statistical offices to calculate weights for consumer price indices. Additionally, recording such data is hard and time consuming, which makes it difficult to gather much other information on the households taking part. Time series of family expenditure surveys can provide valuable information. For example, using such data it is possible to map out the evolution of the distribution of consumption over time. However, (1) many countries conduct budget studies to reweight their consumer price series only on an irregular basis; (2) the lack, in such surveys, of information on items other than expenditures can mean that while the evolution of the distribution of expenditure can be tracked, it is more difficult to isolate the sources of change; and (3) the interpretation of the distribution of expenditure levels depends on assumptions about the nature of intertemporal allocation, about preferences, and about credit and insurance markets, which are difficult or impossible to test with cross-section data. Time series of family expenditure surveys can also be used to construct quasi-panels, which have been extensively used in the consumption research literature. There are, however, significant limitations to what can be done with a quasi-panel: many dynamic situations cannot be analyzed convincingly with quasi-panel data. The obvious way to overcome many of these problems is to collect a diary based panel. It is generally felt that this is not possible for long periods (more than four quarters) because of the respondent burden. The experience in Spain with the EBFS suggests that the pessimism here may be exaggerated. The EBFS is a nationally representative expenditure survey. It collected diary based consumption information for a large group of households for 24 periods in the 1970s, and collected a rolling panel in which households participated for eight periods in the 1980s and 1990s.
d) Use panel data on wealth and income to evaluate total expenditure. In principle, the intertemporal budget constraint (income minus consumption equals the change in wealth) implies that consumption can be evaluated from panel data on income and wealth. In practice, wealth holdings are usually measured with considerable noise, and first differencing makes it very difficult to extract the expenditure 'signal' in the data. This approach might be more successful with administrative data in countries in which wealth is recorded by the government.
e) Ask retrospective questions on consumption and expenditures. This is the focus of the present paper. The practice is actually more widespread than is usually thought. Some national expenditure surveys are partially or wholly based on retrospective questions. For example, the US CEX rolling quarterly panel is based on interview recall questions, and the Canadian FAMEX collects household information on annual calendar year expenditures. More commonly, expenditure information on durables, clothing and less frequently purchased items is based on retrospective questions coupled with diaries for day-to-day expenditures.
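The difficulty with option d) can be illustrated with a minimal sketch. It backs consumption out of the budget constraint (income minus the change in wealth) and then shows how measurement error in the two wealth levels carries over to the implied consumption figure. All numbers, the function name and the assumption of independent, normally distributed reporting errors are hypothetical and chosen only for illustration.

```python
import random

def consumption_from_budget_constraint(income, wealth_start, wealth_end):
    """Implied consumption: income minus the change in wealth over the period."""
    return income - (wealth_end - wealth_start)

# Hypothetical household: income 30,000 and true saving of 2,000.
true_income = 30_000.0
true_w0, true_w1 = 50_000.0, 52_000.0
true_c = consumption_from_budget_constraint(true_income, true_w0, true_w1)

# Assumption: each reported wealth level carries independent error with sd 5,000.
random.seed(0)
noise_sd = 5_000.0
draws = []
for _ in range(10_000):
    w0 = true_w0 + random.gauss(0, noise_sd)
    w1 = true_w1 + random.gauss(0, noise_sd)
    draws.append(consumption_from_budget_constraint(true_income, w0, w1))

mean_c = sum(draws) / len(draws)
sd_c = (sum((d - mean_c) ** 2 for d in draws) / (len(draws) - 1)) ** 0.5

print(f"true consumption: {true_c:.0f}")
print(f"mean implied consumption: {mean_c:.0f}, sd: {sd_c:.0f}")
```

Under these assumptions the implied consumption is unbiased, but its standard deviation is roughly the square root of two times the error in each wealth level, because first differencing adds the two error variances. With errors of this size the noise is several times larger than the household's saving.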

Questions about total household expenditure
It is unlikely that we will ever be able to design a set of questions short enough to be included in most surveys and also comprehensive enough to meet most research needs. Nonetheless, the inclusion of some consumption questions in general purpose surveys (be they cross-section or longitudinal) is potentially very valuable. We also, however, need to have an eye to designing consumption questions for more tightly focused surveys which are partly designed to answer research questions in the consumption and saving area. As we shall see, the design criteria differ between the two contexts. It is also important to be aware of the psychology of survey response.
We first discuss the feasibility of asking a single broad 'total expenditure' question. This has obvious attractions if we are interested in total expenditure, but there are significant problems. The other two methods that can be used to recover total expenditure are based on asking questions about expenditures on sub-items of the total, such as food at home, clothing and utilities. In the first method respondents are asked about an exhaustive range of items, and in the second about a selected subset of the total list of sub-items. Since 'food at home' questions are used in both methods and are also widely used in their own right, the experience of asking about food at home is analyzed in detail first. We then analyze the experience of asking about an exhaustive list of consumption items, that is, asking for expenditures on all of the components of total expenditure and taking their sum to be total expenditure. The issues here are which items to choose and the appropriate level of disaggregation. After that, we discuss asking questions on selected items (including 'food at home') and then using this partial information to impute the total. The final section presents some recommendations based on the foregoing analysis.
If we are interested in the total expenditure of a household in a given period, then one superficially attractive procedure is simply to ask respondents how much this is (how much did you spend last month on everything?). As most readers will readily believe, this will not elicit a very accurate answer. For this reason, only a few surveys contain such broad consumption questions.
The total expenditure question in the COEP followed the questions concerning individual items. The exact form of the question was: "About how much did you and your household spend on everything in the past month? Please think about all bills such as rent, mortgage, loan payments, utility and other bills, as well as all expenses such as food, clothing, transportation, entertainment and any other expenses you and your household may have."
This was a first attempt at asking a total expenditure question. Experience suggests a number of problems with the question in this form. First, the time period should have been specified more precisely. It may be better to ask about expenditures in the last calendar month. More importantly, there are also significant problems with the cues (the list of expenditure items that respondents are asked to think about). First, the cues include 'loan payments', which is clearly a saving item and not a consumption item. It would also be useful to explicitly exclude insurance payments. Second, the cues do not mention durables, and this seems to have caused problems for the analysis of responses. In retrospect it would have been better to have explicitly excluded purchases of durables and to have asked about these separately. It would also be best to exclude housing expenditures and to ask about these separately. Another problem with the set of cues given above is that they were designed for one specific population (unemployed Canadian workers) and may not be appropriate in other contexts. For example, if one were sampling old people then one would want to include out-of-pocket medical expenses in the cues. A study of young people might explicitly mention items such as schooling or child care expenses. In general, it is better to tailor the list of cues to the target population.
The Bank of Italy Survey on Household Income and Wealth (SHIW) is a representative sample of the Italian population (even though response rates in recent waves are in the 40-70% range), with around 7000 participating households in every wave. The SHIW asks respondents a very broad range of questions, including one on their average monthly expenditure on all items except for a few listed durable goods and another on monthly expenditure on food alone.
One issue that is often raised in this context is the item response rate for such questions. There seems to be a pervasive view that recall expenditure questions are more difficult to answer than the recall income and earnings questions which are commonly asked in general surveys. The experience with the COEP calls this view into question. Moreover, the item non-response observed in the COEP is not anomalous.
The second point is that the difficulty that respondents have in answering such questions varies in important ways with the characteristics of the respondent and her or his household. There is less item non-response to the total expenditure question when the respondent is the head (primary earner) of the household, and much more non-response when the respondent lives in a 'composite household' (that is, a household which comprises individuals other than a single person, a couple, or a couple and their children). This fact will obviously be important for survey designers to consider. Are you surveying households or individuals? Are there many composite households in your population (as there are, for example, in Italy and Japan) or rather fewer?
A second issue to consider with respect to recall total expenditure questions is that they display considerable heaping and rounding. Since this is a familiar problem and there are well established ways of dealing with it, it presents a relatively minor problem for the analysis.
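Heaping can be diagnosed very simply before any correction is attempted. The sketch below computes the share of reported amounts that fall exactly on round numbers; the function name and the reported amounts are hypothetical, and the sketch only measures heaping rather than implementing any of the established corrections.

```python
def heaping_share(responses, base=100):
    """Share of positive integer responses that are exact multiples of `base`."""
    positive = [r for r in responses if r > 0]
    if not positive:
        return 0.0
    heaped = sum(1 for r in positive if r % base == 0)
    return heaped / len(positive)

# Hypothetical reported monthly totals, in whole currency units.
reported = [1500, 2000, 1830, 2400, 1000, 1275, 3000, 2000]

share_100 = heaping_share(reported, base=100)
share_500 = heaping_share(reported, base=500)
print(f"multiples of 100: {share_100:.3f}")
print(f"multiples of 500: {share_500:.3f}")
```

Comparing these shares with the shares expected under a smooth distribution (about 1% for multiples of 100) gives a quick indication of how severe the rounding is.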

Questions about food at home
A number of surveys contain a question on expenditure on 'food at home'. This is potentially very useful information for imputing total expenditure. On the other hand, Attanasio and Weber (1994) stress that food consumption is unsuitable as a proxy because preferences are not separable between food and other nondurables.
This section presents a detailed discussion of the findings for food at home questions. One source of information on the reliability of survey questions on food at home is the U.S. Consumer Expenditure Survey (CEX). This survey has both an interview and a diary component. Specifically, the CEX consists of two separate samples: one is a rotating panel (following households for four quarters) and the other is a cross section. The panel sample households are asked, in each interview, recall questions on their consumption of a large number of items over the previous month and quarter. The cross section sample households instead fill in a detailed diary on expenditure on a number of non-durable goods and services. The two samples are independent draws from the same population and the sample design is common.
In the CEX interview sample (1980-1998) a food at home question has always been asked, but the exact wording of the question has changed twice. In 1980-81 there were three questions that can be used to infer food expenditures. First, the respondent was asked about usual weekly expenses at the grocery store or supermarket. Then they were asked how much of this was not for food. Finally, the respondent was asked about food purchased in other places (such as bakers). This information was then used to construct the 'food at home' measure. In 1982-87 the question was changed to ask how often and how much was spent on food over the previous month. In 1988 the 1980-81 question was resumed. Average spending on food appears to be heavily affected by the structure of the questions. This is of obvious importance for anyone designing recall expenditure questions.
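The three-question design described above can be turned into a 'food at home' measure along the following lines. This is a minimal sketch and not the construction actually used for the CEX: the function name, the assumption that food bought in other places is also reported on a weekly basis, and the conversion factor of 52/12 weeks per month are all illustrative assumptions.

```python
# Assumption: usual weekly amounts are scaled to a month with 52/12 weeks.
WEEKS_PER_MONTH = 52 / 12

def food_at_home_monthly(weekly_grocery, weekly_nonfood, weekly_other_food):
    """Combine three recall answers into a monthly 'food at home' figure.

    weekly_grocery    -- usual weekly spending at the grocery store or supermarket
    weekly_nonfood    -- the part of that spending that was not for food
    weekly_other_food -- food bought in other places, such as bakers
                         (assumed here to be reported per week)
    """
    if weekly_nonfood > weekly_grocery:
        raise ValueError("non-food spending cannot exceed total grocery spending")
    weekly_food = weekly_grocery - weekly_nonfood + weekly_other_food
    return weekly_food * WEEKS_PER_MONTH

# Hypothetical respondent: 120 at the supermarket, 20 of it non-food, 15 elsewhere.
monthly_food = food_at_home_monthly(120.0, 20.0, 15.0)
print(f"monthly food at home: {monthly_food:.2f}")
```

The consistency check on the second answer is one example of the kind of validation that a multi-question design makes possible and a single question does not.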
We can also use the Canadian and Italian surveys to look at this issue. Turning first to the Italian surveys, BMW shows that food expenditure data are of comparable quality and informational content across the two surveys, once heaping, rounding and time averaging are properly accounted for. Turning to the Canadian data, once again the recall and expenditure survey questions are close in central measure and not too different in dispersion.
The general conclusion from this analysis is that respondents seem to do a remarkably good job of reporting their household's expenditures on 'food at home'. This is in contrast to the experience with the 'total' expenditure questions.

The questions on all of the sub-items
One procedure that is sometimes used to recover total expenditure in a survey is to ask a series of questions on all of the sub-items of the total. For example, we could ask for total non-durables, durables and housing, but usually we think of the exhaustive list as being more detailed than this. Thus the Japanese Panel Survey on Consumers (JPSC) asks about 15 distinct expenditure items that cover all expenditures, and the Dutch VSB asked about 35 different nondurable items. On the other hand, Larsen (2002) stresses that total expenditure is then an unweighted sum of expenditures that each contain measurement error.
There are two closely related aspects to asking an exhaustive list. The first is the correct level of disaggregation. The second is how accurately the sub-items are reported. Pradhan (2001) presents evidence on the first issue based on the Indonesian national socioeconomic survey (Susenas). This consists of a core questionnaire and a module questionnaire. The core questionnaire is administered to the whole sample (over 200,000 households), the module questionnaire to a large sub-sample (about a third of the total). The module questionnaire contains expenditure and self-consumption records on 218 items; the core questionnaire has records on 15 broad commodities. The module consumption items can be directly aggregated into these 15 commodities, but the same household is never asked to provide both detailed and aggregate measures for them. Comparing the two samples, Pradhan finds that the (aggregated) core questionnaire underestimates total consumption relative to the (disaggregated) module by between 11.7% and 19.6%. In line with the results of the previous section, food expenditures are less severely underestimated in the core questionnaire; conversely, non-food consumption is on average at least 23.8% lower than in the module data. Looking at the 15 broad categories, Pradhan reports underestimation for most goods, with the worst being durable goods (-46%), housing and utilities (-31%) and miscellaneous goods and services (-53%). He also reports overestimation for some goods: education (+28%), alcohol (+83%) and tobacco (+9%). The evidence does not change much if we consider a yearly rather than a monthly recall period. Pradhan (2001) also finds that the reporting differences are correlated with the level of total expenditure. He summarizes his results as follows: "using a high level of aggregation yields a lower consumption measure and the fraction of underestimation increases as consumption rises".
It seems that although the 'exhaustive list' method is widely used, it is quite demanding in terms of interview time. The evidence given above suggests that we need to ask about a quite detailed list of items and that some of these may be reported with substantial error. Given this, we might reasonably ask if it would not be better to drop the noisy questions altogether and to concentrate on using information on a non-exhaustive list of items which are thought to be better measured.
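The percentages quoted above are relative differences between the aggregated (core) and disaggregated (module) measures. A minimal sketch of that comparison follows; the function name and the category means are hypothetical and are not Pradhan's data.

```python
def relative_difference(core_mean, module_mean):
    """Core measure relative to the module measure; negative means underestimation."""
    return (core_mean - module_mean) / module_mean

# Hypothetical mean monthly expenditures by category: (core, module).
category_means = {
    "food": (540.0, 600.0),
    "housing and utilities": (207.0, 300.0),
    "education": (64.0, 50.0),
}

differences = {
    name: relative_difference(core, module)
    for name, (core, module) in category_means.items()
}
for name, diff in differences.items():
    print(f"{name}: {diff:+.0%}")
```

Because the two questionnaires are answered by different households, a comparison of this kind is made on sample means rather than household by household.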

The questions about selected sub-items
In this section we address the following questions: if one can only ask questions about some sub-components of total expenditure, which components should one choose? And how should the responses be used to construct a measure of "total" expenditure? These issues are important because recall total questions, while containing valid variance and being suitable for some uses, are nonetheless subject to flaws (such as underreporting) that make them unsuitable for other uses (for example, constructing savings measures). In using this partial information, we shall suppose that we have available an associated expenditure survey which gives reliable information on all goods.
We shall be interested in how we could use information on a non-exhaustive list of expenditure items in imputing total non-durable expenditure. This raises a number of issues. Which subset of goods should be chosen for the analysis? How should we choose the weights? How can we allow for the fact that the individual expenditure items are measured with considerable noise?
Although we do not rule out the possibility of making these choices optimally, we will use a very simple scheme that follows the line developed by Skinner (1987). In this scheme we first choose a subset of goods $1, 2, \ldots, k$ and then run the following regression on expenditure survey data:
$x = \pi_0 + \sum_{j=1}^{k} \pi_j x_j + u$    (1)
where $x$ is total expenditure, $x_1, x_2, \ldots, x_k$ are expenditures on the selected goods and $u$ is an error term. Denote the OLS estimates by $\hat{\pi}_j$. This gives us weights to use in predicting total expenditure in the non-expenditure survey that we are interested in:
$\hat{x} = \hat{\pi}_0 + \sum_{j=1}^{k} \hat{\pi}_j x_j$    (2)
Browning et al. (2002) recommend using for this procedure the expenditure items that are believed to be well measured by recall questions. For their research purposes they consider only 'food at home', 'food outside the home' and expenditures that are regularly billed. For the latter they take 'phones' and 'utilities' (a composite of water, fuel and electricity). This choice is the result of some prior analysis of the data, but they do not rule out that a more systematic analysis would give an improvement on these items. Income is not used as a predictor even though it is surely a good one. There are two reasons for this. First, in many expenditure surveys income is not well measured. Second, the use of income to impute expenditure introduces spurious relationships between income and the imputed measure which invalidate some uses of the imputed measure (for example, testing for excess sensitivity). Browning et al. (2002) explore this issue in Canadian data (using the 1996 FAMEX), in Italian data (using the SFB) and in the Spanish ECPF, which has a panel aspect that allows annual differences to be taken. They report the results of five experiments; for each they report coefficient estimates, the $R^2$ for the regression, and also the $R^2$ for the fit of the estimated model on a sub-sample of 25% of households that were randomly held back from the estimation. The latter provides a test for 'overfitting' in the original regression.
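The regression-based imputation and the 25% hold-back check described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Skinner (1987) or Browning et al. (2002) as implemented: the data are synthetic, the data-generating coefficients are invented, only two predictors ('food at home' and 'food outside the home') are used, and all function names are hypothetical. OLS is computed with the normal equations in plain Python so that the sketch has no dependencies.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """OLS of y on a constant and the sub-item expenditures in rows."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(pi_hat, rows):
    """Imputed totals: constant plus weighted sum of sub-item expenditures."""
    return [pi_hat[0] + sum(p * v for p, v in zip(pi_hat[1:], r)) for r in rows]

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic "expenditure survey" (all coefficients are invented).
random.seed(1)
sub_items, totals = [], []
for _ in range(400):
    food_home = random.uniform(200, 800)
    food_out = random.uniform(0, 300)
    total = 300 + 1.5 * food_home + 2.0 * food_out + random.gauss(0, 150)
    sub_items.append([food_home, food_out])
    totals.append(total)

# Hold back 25% of households; rows are i.i.d. draws, so the first quarter
# is a random sub-sample.
n_holdout = len(totals) // 4
x_hold, y_hold = sub_items[:n_holdout], totals[:n_holdout]
x_fit, y_fit = sub_items[n_holdout:], totals[n_holdout:]

pi_hat = fit_ols(x_fit, y_fit)
r2_fit = r_squared(y_fit, predict(pi_hat, x_fit))
r2_holdout = r_squared(y_hold, predict(pi_hat, x_hold))

# Impute the total for a household in a general purpose survey that
# reports only the two food items.
imputed = predict(pi_hat, [[450.0, 120.0]])[0]
print(pi_hat, r2_fit, r2_holdout, imputed)
```

If the hold-back fit is much worse than the fit on the estimation sample, the regression is likely to be overfitting, and the estimated weights should not be trusted for imputation in another survey.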
Regressing total non-durable expenditure on 'food at home' and 'food outside the home' shows that these two 'predictors' 'explain' 56% and 67% of the variance of non-durable expenditure in the Italian and Canadian data respectively. Thus the food categories 'explain' a good deal of the variance of total non-durables. One important aspect of this is that they include a constant; Skinner finds a lower $R^2$ for total food on the US CEX (only 26%), but this is without including a constant. When they add two utilities categories, the $R^2$ rises to 63% and 74% respectively. They also indicate that adding demographics leads to a small increase in explanatory power.
These results suggest that by imputing the total from the sub-items we can 'explain' a substantial proportion of the total variability. The evidence presented here concerns cross-section variability.

Conclusions
This paper has presented various methods that can be used to recover a measure of total household expenditure in general purpose surveys. There is rather more information about asking expenditure questions than is sometimes thought, and the various surveys that do so provide some guidance as to future possibilities. Some of the recommendations are common to any survey question. For example, it is no use asking a question about something that the respondent does not know much about. This suggests that it may be worth asking specific questions on how well informed the respondent is about household matters such as household expenditures, household income and family links. Equally obviously, the specific form of the question can make a big difference to responses: extensive pretesting is always recommended for non-standard questions.
In general, the most accurate recall based measure of total expenditure will be derived from asking about an exhaustive list of highly disaggregated expenditure items. This is, however, a counsel of perfection that few general purpose surveys could afford. Given that this is not feasible, we suggest the following. First, always ask a 'food at home' question. It seems that respondents can report this accurately and, being a large budget item, it is very useful in imputation. Second, always ask a 'food outside the home' question. Although there is no convincing evidence on the accuracy of a recall question on this, it is a useful complement to the 'food at home' question. This is because the two items are obvious substitutes and there is a great deal of heterogeneity in the two food budget shares for households that have the same level of total expenditure. Thus the two measures together give a better predictor for the total.
The analysis presented in the last section also suggests that it is worth collecting information on utilities and telephones. One warning note here is that the utilities expenditure information used in the imputation analysis is typically validated by the interviewer seeing bills and noting the specific amounts and time periods. It is not clear that a simple question such as "how much did you spend on water, fuel and electricity in the last calendar month" will elicit accurate information.
The analysis in the last section suggests that asking for just a few sub-items of expenditure recovers a reasonable amount of the information needed to impute nondurable consumption accurately. However, one concern with this strategy used alone is that there is a great deal of heterogeneity in expenditure patterns, and some budget items can sometimes be idiosyncratically large. For example, it may be that a particular household is very keen on horse riding and spends half of its total expenditure on that. Clearly the list above would lead us to dramatically underestimate total expenditure for such a household. Consequently, we suggest supplementing the sub-item list above with a 'total nondurable' expenditure question. At present it is not clear how to optimally combine these two sets of information and the supplementary evidence presented above on bias. Nevertheless, the analysis shows that recall questions about total expenditure can generate reasonable response rates. It is likely that the total expenditure question does give some genuine and valuable extra information over and above the responses on the sub-items.
In the introduction we outlined the advantages of having total expenditure information in general purpose surveys. Many researchers doubt that such information can be recovered without expensive diaries or long lists of recall expenditure questions. This is too pessimistic a point of view. It is possible to elicit a great deal of useful information on expenditures, and as time goes on we shall discover better ways of using this information.