Individual differences in literary reading: Dimensions or categories

Literary text reading has long been a subject of empirical research. Various measures of reader differences and reader typologies have been suggested, the most prominent being studies of literary expertise and studies employing the Literary Response Questionnaire (LRQ; Miall & Kuiken, 1995). Literary expertise is difficult to define and fails to account for potential differences among non-experts. LRQ and similar dimensional approaches, in turn, neglect the possibility that a salient reader typology exists. The main goal of this study is to test whether a salient reader classification can be formed from participant responses to questionnaires, and how such a classification corresponds to self-reported reader expertise. Based on responses from 741 participants (78.41% female, mean age = 24.31), we test the factor structure of LRQ in its Serbian translation and find a moderate, acceptable fit. We also present our own Receptiveness to Literature Questionnaire (UPK), with two factors named Thorough Reading and Reading for Pleasure. Finally, we discuss the relations between LRQ and UPK, offer classifications of readers formed from participant factor scores, and test the congruence between these classes and self-reported participant expertise. Our results indicate that a dimensional approach should be favored over forming categories of readers.


Differences between Readers
We can differentiate two approaches to studying individual differences between readers, both of which are also prominent in contemporary psychology. The typological, person-based approach (Asendorpf, 2002; Asendorpf et al., 2001; Robins, John, Caspi, Moffitt, & Stouthamer-Loeber, 1996) assumes the existence of discrete types within a population. For example, Hoffstaedter (1987) separated readers into three types based on preferred text complexity and interpretation habits. Other researchers created typologies based on reading history and habits (Dixon et al., 1993; van Rees, Vermunt, & Verboord, 1999). Although interesting, these and similar classifications usually remained confined to the very study in which they were presented.
The primary typological way in which differences between readers are considered is reader expertise. Numerous studies have separated participants into two groups based on some sort of formal training (e.g., Bortolussi & Dixon, 1996). Results ordinarily show that experts exhibit different response patterns, employ different strategies during reading, or that they simply enjoy reading more than novices (e.g., Dorfman, 1996; Earthman, 1992; Graves & Frederiksen, 1991; Hanauer, 1996; Janssen, Braaksma, Rijlaarsdam, & Van den Bergh, 2012; Peskin, 1998).
Despite its extensive use, expertise in literary reading is difficult to define. Graves (1996) discussed the issues of operationalizing expertise, i.e., whether we should focus on generic or domain-specific expertise (cf. Warren, 2011) and which tasks are suitable for studying expertise effects on literary reading, noting that the options at each of these steps may affect study outcomes. Moreover, we would argue that studying expertise is not a suitable replacement for studying differences between readers in general. The goal of expertise studies is to investigate the outstanding, not the ordinary, performance of experts (Ericsson, 2006), with novices primarily included as a control group, often situated at the far end of the skill distribution (Vaci & Bilalić, 2017; Vaci, Gula, & Bilalić, 2014). Expertise research therefore treats the majority simply as a non-minority: although it can tell us what characterizes experts, it cannot tell us much about non-experts, who may not be a homogeneous group at all. Since empirical studies of literature should be concerned with "real readers" (Miall, 2006), and since most real readers are not experts, a criterion for distinction other than expertise should be employed.

Receptiveness to Literature Questionnaire
We find certain characteristics of LRQ potentially hindering, especially its length, its complexity, and the fact that it focuses on individuals who read extensively and on fictional texts. The Receptiveness to Literature Questionnaire (UPK) was designed to contrast with LRQ in those regards, and that intention served as a guideline throughout its construction. Note that this does not make UPK an alternative to LRQ, as the two questionnaires were developed differently and intended for different uses. However, we believe there are circumstances in which LRQ may not be an adequate choice, and UPK was designed to complement it.
First, we wanted to construct a questionnaire that can be discriminative at all levels of receptiveness. LRQ was intended to target readers "with a relatively well-developed conception of literature" (Miall & Kuiken, 1995, p. 3). We therefore assumed that LRQ may be less successful at distinguishing finer-grained differences between individuals who do not like reading and do not read often. Our goal was to capture differences between individuals from all strata: from non-readers, through non-expert readers, to experts.
Second, a receptiveness questionnaire should be applicable to all kinds of literary texts. Many LRQ items specifically relate to reading novels or short stories, and one of the more frequent words in those items is fiction. An entire factor, named Story-Driven Reading, focuses solely on enjoying narrative plots. Eva-Wood (2004) adapted LRQ into the Poetry Response Questionnaire by rewording items to refer to poetry and excluding Story-Driven Reading. UPK items were designed to eschew the need for such modifications by referring to the reading material only as a book or a text.
Finally, we wanted to build UPK "bottom-up". Miall and Kuiken (1995) relied on already published questionnaires and conceptions of reader behavior when constructing LRQ. Starting with a large collection of laden concepts may be the primary reason for the complex items and the complex structure that failed to replicate (cf. van Schooten, Oostdam, & de Glopper, 2001). Instead, we turned to readers themselves to report on their literary reading experience (Oljača & Nenadić, 2015). We gathered interview responses from a convenience sample of 29 participants with varying degrees of literary education and reading habits. Following a thematic analysis (Braun & Clarke, 2006), a 57-item questionnaire was designed. A sample of 302 participants (primarily undergraduate students) completed the pilot version of UPK alongside a number of other questionnaires. Upon the exclusion of eight items, three factors were extracted: Thorough Reading (21 items; α = .92), Disinterest (16 items; α = .89), and Immersion (12 items; α = .92). These factors described two different approaches to reading, including differences in reading purpose, represented by the Thorough Reading and Immersion factors, and a general lack of time or desire to read literary texts, represented by Disinterest. The questionnaire had good convergent and divergent validity, assessed by correlations with Read/Write, Reading, Gardening, Pets, and Exercise from the Oregon Avocational Interest Scales (ORAIS; Goldberg, 2010), and IPIP versions of Aesthetic Appreciation (HEXACO; Ashton, Lee, & Goldberg, 2007), Artistic Interests (NEO-PI-R domains; Costa & McCrae, 1992), and Absorption (MPQ; Tellegen, 2003). Finally, a MANOVA showed that the experts in the sample (students of literature) scored higher than their peers on Thorough Reading only, lending further support to the construct.

The Present Study
The first goal of this study is to assess the adequacy of LRQ and UPK as instruments for measuring individual differences between readers. We also wanted to further shorten UPK and establish its factor structure on a larger sample.
Although previous studies formed reader typologies or measured dimensions of reader characteristics, no study has tested whether differences between readers are continuous or whether readers should in fact be classified. It may seem that the dimensional approach is always more informative, but when multiple dimensions of individual differences are concerned, classes may reveal interesting response patterns that would otherwise remain unnoticed. In other words, classes may offer a more manageable, and sometimes deeper, understanding of differences between individuals. Therefore, the second and central goal of this study is to test whether a salient reader classification can be formed based on participant responses to LRQ and/or UPK.
Finally, we wanted to test the relations between UPK and LRQ participant scores, potential classes formed based on responses to the questionnaires, and self-reported participant expertise.

Method
Participants
807 participants from Serbia completed the questionnaires, distributed online in a snowball sampling procedure. Participants completed the questionnaires in their own time and received no compensation for participation. Upon the exclusion of underage participants, duplicate entries, and outliers, 741 participants remained (78.41% female, mean age = 24.31, SD = 6.26). Most participants had completed high school (54%), undergraduate studies (24%), or a master's degree (18%), while only a small number had yet to finish high school (3%) or had already obtained a PhD (2%). Participants were also grouped as experts (24.16%) or non-experts, with females comprising the majority of experts (86.59%). The expert versus non-expert classification was based on participants' response to the question of whether their studies or profession is in the field of literature.

Instruments
Literary Response Questionnaire (LRQ; Miall & Kuiken, 1995). LRQ was translated to Serbian for the purposes of this study. LRQ consists of 68 five-point Likert-type items distributed across seven first-order factors: Insight (14 items), Empathy (7), Imagery Vividness (9), Leisure Escape (11), Concern with Author (10), Story-Driven Reading (8), and Rejection of Literary Values (9), with reliability indices ranging from .79 to .92 in our sample. The second-order structure consists of two components: (1) Experiencing (positive loadings of Insight, Empathy, Imagery Vividness, Leisure Escape, and Concern with Author) and (2) Literal Comprehension (positive loadings of Story-Driven Reading and Rejection of Literary Values, and a negative loading of Concern with Author).

Data Analysis
Factor analyses were conducted in the FACTOR software, version 10.03.01 (Lorenzo-Seva & Ferrando, 2013). First, we rotated responses to LRQ using an oblique Procrustes Rotation (Browne, 1972), which tests the fit of the data to a theoretical solution. We assessed the similarity of our solution to the original LRQ solution using Tucker's congruence coefficient (Lorenzo-Seva & ten Berge, 2006). The provided target matrix specified which items should have significant loadings on the seven first-order factors of LRQ and whether the relation should be positive or negative. In addition, we performed an orthogonal Procrustes Rotation on participant factor scores for the seven first-order dimensions to test the second-order structure. In this case, following the procedure from the original study, Principal Components Analysis was employed.
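The congruence measure itself is simple to compute. As an informal illustration (in Python rather than the FACTOR software; the function name and variables are ours), Tucker's coefficient for a pair of factor-loading vectors is:

```python
import numpy as np

def tucker_phi(x, y):
    """Tucker's congruence coefficient between two loading vectors:
    phi = sum(x_i * y_i) / sqrt(sum(x_i^2) * sum(y_i^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))

# identical loading patterns give a coefficient close to 1;
# orthogonal patterns give 0
print(tucker_phi([0.7, 0.6, 0.5], [0.7, 0.6, 0.5]))
```

By convention, values of .95 or above indicate factor equality, and values between .85 and .94 fair similarity (Lorenzo-Seva & ten Berge, 2006).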
Second, we performed a Minimum Rank Factor Analysis (ten Berge & Kiers, 1991) with Promin rotation (Lorenzo-Seva, 1999) on 49 UPK items. Optimal implementation of parallel analysis was the primary factor retention criterion (Timmerman & Lorenzo-Seva, 2011). Since one of our goals was to reduce the number of UPK items, factor loadings higher than .4 were considered relevant (see Yong & Pearce, 2013).
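The logic of parallel analysis is to retain only those factors whose observed eigenvalues exceed the eigenvalues expected from random data. The sketch below illustrates the idea in Python under simplifying assumptions (plain Horn-style parallel analysis on Pearson correlations; FACTOR's "optimal implementation" differs in detail, e.g., in how eigenvalues and percentiles are derived):

```python
import numpy as np

def parallel_analysis(data, n_iter=100, percentile=95, seed=0):
    """Retain factors whose observed correlation-matrix eigenvalues
    exceed the chosen percentile of eigenvalues obtained from
    column-wise permutations of the data (a simplified sketch)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # observed eigenvalues, sorted in descending order
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    rand = np.empty((n_iter, p))
    for i in range(n_iter):
        # permute each column independently to destroy correlations
        perm = np.column_stack([rng.permutation(data[:, j]) for j in range(p)])
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(perm, rowvar=False)))[::-1]
    thresh = np.percentile(rand, percentile, axis=0)
    return int(np.sum(obs > thresh))
```

For example, data simulated from two strong latent factors would yield a retained-factor count of two, mirroring the two-factor UPK solution reported below.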
The remaining analyses were conducted in the R statistical platform (R Core Team, 2015). We used the package Hmisc (Harrell et al., 2016) for calculating variable correlations, and the package mclust (Fraley, Raftery, Murphy, & Scrucca, 2012) for latent class analysis, or model-based clustering, of participant factor scores on UPK and LRQ. Model-based clustering assumes that unobserved sub-populations (or latent classes) sharing patterns of responses to manifest variables exist within a studied population. Unlike other classification and clustering techniques, model-based clustering can offer a single-class solution as the best one, which speaks in favor of construct dimensionality, whereas extraction of multiple classes favors the typological approach. The model with the highest Bayesian Information Criterion value was considered optimal (Fraley et al., 2012). The validity of the isolated classes was tested using discriminant analysis (Phillips & Lonigan, 2009). Here, the dependent categorical variables were the isolated classes, while the independent variables were the scores on the questionnaire dimensions.
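For readers more familiar with Python than with mclust, the model-selection logic can be sketched as follows. This is a rough analogue using scikit-learn's GaussianMixture, not the analysis we ran: scikit-learn's BIC is defined so that lower is better (the opposite of mclust's sign convention), and mclust offers a richer set of covariance constraints than the three shown here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def best_gmm(scores, max_classes=5, seed=0):
    """Select the number of latent classes by BIC, mclust-style.
    A one-class winner supports a dimensional reading of the scores;
    multiple classes favor a typology."""
    best, best_bic = None, np.inf
    for k in range(1, max_classes + 1):
        # loop over covariance constraints, analogous to mclust models
        for cov in ("spherical", "diag", "full"):
            gmm = GaussianMixture(n_components=k, covariance_type=cov,
                                  random_state=seed).fit(scores)
            bic = gmm.bic(scores)  # lower is better in scikit-learn
            if bic < best_bic:
                best, best_bic = gmm, bic
    return best

# e.g. best_gmm(factor_scores).n_components gives the retained class count
```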
All the materials and analyses are available in more detail on the Open Science Framework platform (Foster & Deardorff, 2017), at osf.io/rb5nh.

Results
In order for two factors to be considered equal in a Procrustes rotation, Tucker's congruence index should equal .95 or higher. In the case of LRQ, the indices for five factors were higher than .85, indicating fair similarity, with an overall congruence index of .88 for the entire LRQ solution. Reliabilities of the extracted factors ranged from .84 to .93 (Table 1). Only three items did not have loadings of at least .3 on their respective factors. This prompted us to accept the solution and calculate participant scores on the seven first-order factors to be used in further analyses.
An orthogonal Procrustes Rotation Principal Components Analysis showed high congruence (.95) of the second-order structure to the original. Table 1 shows Tucker's congruence indices next to the component labels, in addition to second-order component loadings. The first component, Experiencing, was completely replicated. However, Literal Comprehension shows only fair similarity to the original solution. Reliabilities of the second-order factors are either low in the case of Experiencing (.72) or very poor in the case of Literal Comprehension (.37). We also performed the analysis using the solution presented by van Schooten, Oostdam, and de Glopper (2001), but congruence indices were unsatisfactory.

Following a Minimum Rank Factor Analysis of participant responses to UPK, thirty items were excluded due to extreme mean scores, high skewness and kurtosis, low communalities, and/or low factor loadings. The remaining 19 items loaded on two factors, which together explain 50.4% of the total and 68.5% of the common variance (KMO = .91). The first factor was named Thorough Reading; it explained 29.3% of the common variance, with a reliability index of .88. This factor consists of nine items, almost exclusively drawn from the Thorough Reading factor of the pilot study. The second factor includes a total of ten items stemming from both the Immersion and Disinterest factors of the pilot study, and is here named Reading for Pleasure. This factor explains 39.2% of the common variance, with a reliability estimate of .92. Factor loadings and item communalities of the retained items are given in the Appendix (Table A1), in our own English translation.
We then tested the correlations within and between UPK and LRQ dimensions (Table A2). Thorough Reading and Reading for Pleasure correlate moderately (r = .50). Correlations among LRQ dimensions are also mostly significant and positive. Story-Driven Reading, however, does not correlate with Insight, Imagery Vividness, or Concern with Author. In turn, Rejection of Literary Values correlates negatively with the other LRQ dimensions, except for a positive correlation with Story-Driven Reading. Relations between the dimensions of UPK and LRQ are significant, with the exception of a non-significant correlation between Story-Driven Reading and Reading for Pleasure.
Next, we performed separate model-based clustering analyses of participant factor scores on the seven first-order LRQ dimensions and the two UPK dimensions. The optimal solution for LRQ dimensions includes two ellipsoidal classes with equal shape and orientation (BIC = -14820). The first class, named Alpha, included 176 (23.75%) participants who had higher than average scores on all dimensions, except for Story-Driven Reading (average scores) and Rejection of Literary Values (scores below average). The second class, named Beta, included 565 (76.25%) participants with slightly lower than average scores on all LRQ dimensions, except for Rejection of Literary Values (scores slightly above average). Factor scores on the seven LRQ dimensions for both classes are shown in Figure 1. The optimal solution for UPK dimensions comprises four classes (A, B, C, and D), with scores on both dimensions dropping steadily from Class A to Class D.

As a secondary check of the validity of the cluster solutions, a discriminant analysis was conducted to determine whether cluster members would be correctly classified. For LRQ, the one canonical function was significant (λWilks = .58, χ2(7) = 404.3, p < .001, rcanonical = .65), and the structure matrix suggests that this function was mainly saturated with the Leisure Escape, Empathy, Imagery Vividness, and Insight dimensions. The discriminant function correctly classified 71.6% of Class Alpha and 93.5% of Class Beta members. The overall classification rate was 88.3%; using leave-one-out analyses, the rate was comparable at 88.1%. For UPK, two canonical functions were significant. The first function (λWilks = .22, χ2(6) = 1128.3, p < .001, rcanonical = .88) was saturated positively with Thorough Reading (r = .86) and Interest (r = .68). The second function (λWilks = .99, χ2(6) = 10.2, p < .01, rcanonical = .88) was saturated negatively with Thorough Reading (r = -.54), but positively with Interest (r = .74). The two discriminant functions correctly classified 86.7% of Class A, 97.1% of Class B, 84.7% of Class C, and 0.0% of Class D members.
All Class D members were classified as Class C members. The overall classification rate was 87.4%; using leave-one-out analysis, the rate was comparable at 87.3%.

Agreement between the classifications obtained for UPK and LRQ can be considered moderate for participants with low scores (Table 2). Participants belonging to Class D in the UPK classification almost without exception belonged to Class Beta in the LRQ classification. Similarly, most participants from Class C are in Class Beta as well. However, participants with higher scores on UPK dimensions (Classes A and B) have an equal chance of being classified as either Alpha or Beta in the LRQ classification.

Finally, we assessed the relations between self-reported expertise and participant performance on the questionnaires. A series of Mann-Whitney U tests showed that experts have higher scores on the LRQ dimensions Insight (W = 41574, p < .01), Empathy (W = 42153, p = .01), Imagery Vividness (W = 38709, p < .01), and Concern with Author (W = 36800, p < .01), and on the UPK dimension Thorough Reading (W = 35905, p < .01), while non-experts have higher scores on the LRQ dimensions Story-Driven Reading (W = 58980, p < .01) and Rejection of Literary Values (W = 59322, p < .01). Nevertheless, self-reported expertise did not correspond to the classifications provided by model-based clustering of either UPK or LRQ participant scores. Even though the classes with the highest dimension scores (Class A for UPK and Class Alpha for LRQ) include more experts relative to the total number of participants in a class, this congruence is not sufficiently high (Table 3).
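The expertise comparisons above rely on the standard Mann-Whitney U rank test. A minimal, self-contained illustration with simulated scores follows (the group sizes roughly mimic our sample's expert proportion, but the data themselves are artificial, not the study's):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
# hypothetical factor scores: experts shifted upward by half a standard deviation
experts = rng.normal(0.5, 1.0, size=179)      # ~24% of N = 741
non_experts = rng.normal(0.0, 1.0, size=562)

# two-sided rank test of the group difference
w, p = mannwhitneyu(experts, non_experts, alternative="two-sided")
```

With a shift of this size and these group sizes, the test detects the difference; note, however, that a significant rank test says nothing about whether the same participants form a coherent latent class.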

Discussion
Results indicate an acceptable level of internal validity of LRQ and a relatively stable factor structure, especially considering that it was administered twenty years after its original publication, in another language, and to participants somewhat different from the original sample. The first-order factors are similar to those presented in the original study. The second-order structure failed to replicate in our Procrustes Rotation factor analysis, an issue first raised by van Schooten, Oostdam, and de Glopper (2001) using confirmatory factor analysis. Importantly, the second-order structure failed to replicate on a sample predominantly consisting of students, i.e., one more similar to Miall and Kuiken's (1995) sample than to van Schooten et al.'s (2001). Therefore, use of the first-order structure seems justified, but we suggest abandoning the second-order factor structure completely. We would also call for retesting, and potential adjustment, of LRQ on a new sample of English readers.
Although additional studies are required to confirm the factor structure and internal validity of UPK, it seems to be an adequate, valid, and reliable measure of receptiveness to literary texts. The new short version of UPK distinguishes two dimensions of reading: Reading for Pleasure and Thorough Reading. Reading for Pleasure encompasses items that express immersion in the text, eagerness to read, enjoyment during reading, and being able to read with interest for extended periods without getting bored or fatigued. Thorough Reading maintained much the same structure as in the pilot study, including items that relate to artistic language use and a tendency to be intellectually engaged during or after reading.
UPK has a simple internal structure and a small number of simple items, but it also complements LRQ in that it does not target specific forms of reader response, instead targeting general affinity for reading and receptiveness to literary art. In other words, the crucial distinction between the two questionnaires is that UPK targets two approaches to reading: one in which a reader is fully intellectually engaged with the text, focusing on language use or artistic devices (Thorough Reading), and another in which the reader's main goal is simple enjoyment or pastime (Reading for Pleasure). LRQ, on the other hand, describes in more detail different responses during reading (e.g., Empathy or Imagery Vividness). This distinction is also visible in the correlations between the dimensions of the two questionnaires, which are expectedly weak to moderate, implying that they tap into different (yet not entirely independent) aspects of reader characteristics.
Model-based clustering of the first-order LRQ factors extracted two classes of readers. Class Alpha included readers with increased scores on all dimensions of LRQ, except for average scores on Story-Driven Reading and low scores on Rejection of Literary Values. Class Beta included readers who also have average scores on Story-Driven Reading, but slightly or moderately lower scores on all other LRQ dimensions, except for Rejection of Literary Values, where a slight increase is registered. UPK factor scores produced four classes, with a steady linear drop of factor scores from one class to the next. A discriminant analysis was conducted to test the validity of these classifications, and we registered high congruence in the case of LRQ and almost perfect congruence in the case of UPK, with the exception of the misclassification of Class D. Although these results can be interpreted in favor of typological approaches, we believe they actually imply a continuous nature of the constructs. Whenever a transition from one profile to another consists solely of a lowering of scores on all dimensions, the only implication of classification is somewhat more visible "group" boundaries on a clearly dimensional continuum. In such cases, there is little reason to rely on the less informative classification, as the classes show the same relative patterns of dimension scores. Therefore, for both LRQ and UPK, we suggest that researchers primarily employ a dimensional approach, rather than classify readers.

PSIHOLOGIJA, 2019, Vol. 52(2), 197-215

Finally, not only do LRQ and UPK measure different aspects of differences between readers, but these measures are separate from self-reported expertise as well. Even though classes A (UPK) and Alpha (LRQ) include the highest number of experts, we find little congruence between self-reported expertise and the extracted classes based on either of the two questionnaires.
Two-thirds of Classes A and Alpha are participants who are neither students nor professionals in the field of literature, and although their number drops, there are experts even in the classes with the lowest scores, Beta and D.
Two possible explanations can be offered for this finding. First, self-reported expertise perhaps cannot be considered a valid estimate of a person's actual expertise, especially considering that many of the experts were still students. Such a possibility is a reminder that expertise can be hard to pinpoint and define (Graves, 1996), and a warning for researchers to be careful when basing expertise studies on a reader's study major or year of study.
Second, perhaps LRQ and UPK cannot reliably differentiate between experts and non-experts (the Mann-Whitney U tests detected differences between experts and non-experts on certain dimensions, but these differences are small). This raises an interesting question, as the results imply that there are multiple viable, uncorrelated criteria for differentiating readers, which means that different constructs of relevant reader characteristics could in turn have different relations with everyday reading behavior. The weak relation of expertise to the LRQ classes Alpha and Beta suggests that experts' responses to a text might not be so different from non-experts': the two groups may think equally about what they are reading, or form equally vivid images of the story setting. The construct of expertise does not seem to correspond to receptiveness to literature as measured by UPK either. Future studies should attempt to define what expertise entails; for example, experts may not read more than non-experts or feel more intensely immersed while reading, but they may have a larger knowledge base about the text or draw different conclusions from it. For now, expertise does not seem to entail a more pronounced literary response or greater receptiveness to literary texts, at least as far as self-report methods are concerned.
It is only fair to point out that the participant sample in this study predominantly consisted of young females, and that the self-reported experts were also predominantly female, which may have affected the results. Although Miall and Kuiken (1995) note gender differences only on the Leisure Escape dimension, van Schooten and de Glopper (2006) detect such differences, in favor of females, on multiple LRQ dimensions. We consider our sample too small to make comparisons pertaining to gender differences on LRQ and UPK dimensions, but we believe that sample (gender) structure is an important aspect to investigate in future studies.
We strongly encourage further investigation of what scores on LRQ and UPK dimensions and reader expertise can predict when it comes to text processing and reading outcomes, and, just as importantly, what they cannot. Such studies would allow us to better describe relevant reader characteristics and how they influence reading, to develop more detailed hypotheses based on sample characteristics, and, finally, to apply these findings in school settings, allowing teachers to tailor their lectures to their class or to assess course effects.

Conclusion
Our research used the Literary Response Questionnaire (LRQ) and our own Receptiveness to Literature Questionnaire (UPK) to test whether salient classes of readers can be extracted and how such classes relate to self-reported expertise. LRQ showed a relatively stable first-order structure; however, the second-order structure was not replicated. We suggest a new evaluation of LRQ using an English-speaking sample. Although classes of readers were extracted for both LRQ and UPK, the steady drop in scores across all dimensions favors the dimensional rather than the typological approach. Neither classification corresponded to self-reported participant expertise.