Psychometric evaluation of the Serbian dictionary for automatic text analysis – LIWCser

LIWC (Linguistic Inquiry and Word Count) is widely used word-level content analysis software. It has been used in a large number of studies in clinical, social, and personality psychology, and it has been adapted for text analysis in 11 languages. The aim of this research was to empirically validate the newly constructed adaptation of the LIWC software for the Serbian language (LIWCser). The sample consisted of 384 texts in Serbian and 141 texts in English. It included scientific paper abstracts, newspaper articles, movie subtitles, short stories, and essays. Comparative analysis of the Serbian and English versions of the software demonstrated an acceptable level of equivalence (mean ICC = .70). Average coverage of the texts by the LIWCser dictionary was 69.93%, and the variability of this measure across different types of texts was in line with expectations. The adaptation of the LIWC software for Serbian opens entirely new possibilities for the assessment of spontaneous verbal behaviour, which is highly relevant for different fields of psychology.

There is a consensus among authors that the words we use map our mental, social, and physical states (Frojd, 1969; Tausczik & Pennebaker, 2010). Throughout the history of psychology, a number of prominent researchers have pointed out the importance of studying the ways people naturally talk in the real world (e.g., Bradac, 1986; Gottschalk & Gleser, 1969; Gottschalk, Gleser, Daniels, & Block, 1958). Although this idea has existed in psychology for more than a century, researchers have only recently started to systematically investigate the relationship between psychological constructs, on the one hand, and the content and style of verbal behaviour, on the other (Hirsh & Peterson, 2009; Mehl, Gosling, & Pennebaker, 2006).
Several distinct approaches to quantitative analysis have been developed, i.e., thematic text analysis and automatic text analysis (ATA) (Pennebaker et al., 2003). ATA emerged from the development of artificial intelligence and focuses on the frequency (i.e., intensity) of thematic and/or stylistic characteristics of the text (Shapiro & Markoff, 1997; Pennebaker et al., 2003). Methodologically, it has several advantages. First, since computer software analyses the data, it provides results that are more objective and replicable than manual coding. Second, measurement error (which usually results from individual differences between raters) is minimal, which allows methodological equivalence of different studies. Finally, these data do not share method variance with data obtained with other assessment methods that researchers frequently use in psychology (Mehl & Gill, 2010).
It is possible to differentiate two relatively distinct methodological approaches within ATA. The first approach, Word pattern analysis, is based on complex algorithms and detects how conveyers of meaning (words and phrases) group in large text samples (Wolfe, Schreiner, Rehder, Laham, Foltz, Kintsch, & Landauer, 1998). For example, Latent Semantic Analysis (LSA) enables researchers to determine the similarity of texts based on the latent structure of meaning in the analyzed verbal product, i.e., it is concerned with the use of words in a specific context (Landauer, Foltz, & Laham, 1998). The second approach, Word count strategies, focuses on single-word analyses in order to extract both content and style properties of the text. The basic assumption is that individual differences in the frequency of use of specific words or word groups reflect individual differences in feelings, attitudes, and cognition (Pennebaker et al., 2003). Therefore, software designed to perform single-word analysis focuses on counting words according to predefined (grammatical or semantic) word categories.

Software for the Automatic Text Analysis
In the beginning, researchers used ATA predominantly in the field of clinical psychology, but the focus has broadened to other fields, e.g., social psychology, occupational psychology, and the psychology of individual differences (Pennebaker et al., 2003). Accordingly, several software packages for ATA have been developed over the years, e.g., The General Inquirer (Stone, Dunphy, Smith, & Ogilvie, 1966), TAS/C (Mergenthaler, 1996), and DICTION (Hart, 1984; 2001) (for an overview see Bjekić, Lazarević, Erić, Stojimirović, & Đokić, 2012; Pennebaker et al., 2003; Lowe, 2003).
The authors of the most recent software, Linguistic Inquiry and Word Count (LIWC), constructed it to overcome issues related to judges' ratings in emotional writing assessment (Tausczik & Pennebaker, 2010). The word-count approach is the basis of LIWC; therefore, this software performs successive text analysis with a single word as the unit of analysis. It compares the grapheme pattern of each unit in the input text with the patterns in the dictionary incorporated into the software (Pennebaker, Chung, Ireland, Gonzales, & Booth, 2007). The LIWC dictionary consists of a large number of grapheme patterns (words or word stems 1) classified into categories, where a single pattern can belong to one or several categories. Based on the number of patterns detected, the software provides information about the share of each predefined category in the analyzed text. The content of the dictionary and the software properties have evolved over time, from the first construction attempt in the early 1990s to the current version from 2007 (for details about the process see Pennebaker, Chung, Ireland, Gonzales, & Booth, 2007).
The English LIWC2007 dictionary consists of about 4500 word stems, classified into 63 categories that are relevant to various aspects of human cognitive, emotional, social, and physical functioning (Pennebaker et al., 2007). The authors organized these categories into four groups (Pennebaker et al., 2007) 2. The first group includes various Linguistic processes, e.g., verbs, auxiliary verbs, pronouns, adverbs, prepositions, etc., and other categories consisting of words that manifest the way something is said (e.g., negations, quantifiers, informal words, etc.). In the second group, the authors included 32 hierarchically organised Psychological categories, created specifically for psychological research (Pennebaker et al., 2007). These include several superordinate categories, i.e., Social, Affective, Cognitive, and Biological processes, and Relativity. Each of these has several lower-level categories. For example, the category Social processes includes three lower-level categories: Family, Friends, and People. The third group consists of seven Current concerns, representing some of the most frequent themes in various kinds of texts: Work, Achievement, Leisure, Home, Money, Death, and Religion. The fourth group includes Spoken categories that are especially useful for the analysis of oral production (Fillers, Assents, and Nonfluencies). These were included in order to broaden the analyses beyond pure syntax and content characteristics of the text.
1 The term "word stem" refers to a dictionary unit that is not a complete word. For example, some words are coded in all possible forms (dog: pas (nominative case, singular), psi (nominative case, plural), psu (dative case, singular), etc.). On the other hand, some dictionary units (which are referred to as "word stems") are grapheme patterns with an asterisk at the end, which capture more than one word or word form (e.g., prijatelj*: prijatelj /friend/, prijateljstvo /friendship/, prijateljski /friendly/, etc.). Note that in this sense a "word stem" is not necessarily a lexical or grammatical entity (e.g., jedrenj*: jedrenje /sailing/, jedrenjak /sailboat/, etc.).
2 For a detailed overview of the structure of the English LIWC2007 dictionary, see Pennebaker et al., 2007.
In addition, LIWC2007 provides information about General text descriptors, e.g., word count, percentage of the text covered by the dictionary, number of words longer than six letters, and frequency of different punctuation marks (Pennebaker et al., 2007).
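The word-count procedure described above can be illustrated with a minimal sketch. It is not the LIWC implementation: the tokenisation rule, the function name `analyze`, and the toy dictionary (including the categories assigned to each pattern) are assumptions made for illustration only, with the asterisk treated as a prefix wildcard as described in footnote 1.

```python
import re

# Toy dictionary. The patterns come from the examples in the text, but the
# category assignments are illustrative assumptions, not LIWCser content.
TOY_DICTIONARY = {
    "prijatelj*": {"Friends", "Social"},  # stem: matches any word starting with it
    "jedrenj*": {"Leisure"},
    "ne": {"Negations"},                  # full word: exact match only
}


def analyze(text, dictionary):
    """Return word count, coverage, long-word share, and category percentages."""
    # Assumed tokenisation: lowercase runs of Unicode letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = {}
    matched = 0
    for word in words:
        categories = set()
        for pattern, pattern_categories in dictionary.items():
            if pattern.endswith("*"):
                hit = word.startswith(pattern[:-1])
            else:
                hit = word == pattern
            if hit:
                categories |= pattern_categories
        if categories:
            matched += 1  # the word is covered by the dictionary
            for category in categories:
                counts[category] = counts.get(category, 0) + 1
    total = len(words)

    def pct(n):
        return 100.0 * n / total if total else 0.0

    return {
        "word_count": total,
        "coverage": pct(matched),
        "long_words": pct(sum(len(w) > 6 for w in words)),
        "categories": {c: pct(n) for c, n in counts.items()},
    }


result = analyze("Moj prijatelj voli jedrenje, ne fudbal.", TOY_DICTIONARY)
print(result)
```

Each category value is expressed as a percentage of all words in the text, so values are comparable across texts of different lengths; a word that belongs to several categories is counted once in each of them but only once towards coverage.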
Even though there is a large body of evidence suggesting that individual differences in word use are related to different important psychological variables, the mechanisms underlying this relationship are yet to be discovered.

Translations and Adaptations of LIWC Dictionary to Different Languages
The first LIWC software used only the English dictionary; thus, authors used it in psychological research within the English-speaking population. Since it proved to be a useful tool in different areas of research, researchers started developing dictionaries in other languages. Among the first dictionaries to be developed were Dutch (Zijlstra, van Meerveld, van Middendorp, Pennebaker, & Geenen, 2004), Italian (Alparone, Caso, Agosti, & Rellini, 2004), Spanish (Ramirez-Esparza et al., 2007), and German (Wolf, Horn, Mehl, Haug, Pennebaker, & Kordy, 2008). It is interesting that the first translations differed only slightly from the English dictionary, due to linguistic similarities between these languages.
However, the development of some other dictionaries, like French (Piolat, Booth, Chung, Davids, & Pennebaker, 2011) and Chinese (Huang, Chung, Hui, Lin, Seih, Chen et al., in press), was very time consuming, since it demanded alterations in the software itself in order to make text analysis possible. Namely, the Chinese version of the software (C-LIWC) had to be able to segment the words before processing the text, while the French version had to allow the inclusion of accent markers in the analysis. Besides these, Arabic (Hayeri, Chung, & Pennebaker, 2010), Russian (Kailer & Chung, 2011), Turkish (Murderrisoglu, 2011), and Korean (Lee, Shim, & Yoon, 2005) dictionaries were developed.
All adaptations of the LIWC software, except for the Arabic, Turkish, and Russian ones (to the best of our knowledge), were empirically validated and shown to be useful tools in psychological research beyond English-speaking countries. For example, the Spanish LIWC proved useful in research on depression (Ramírez-Esparza, Chung, Sierra-Otero, & Pennebaker, 2009), bilingualism, and personality (Ramírez-Esparza, Gosling, Benet-Martínez, Potter, & Pennebaker, 2006). The Korean LIWC was used in the analysis of political speeches (Chung & Park, 2010), in research on relations between verbal output and age (Lee, Park, & Seo, 2006), and for the investigation of relations between basic personality structure and the frequency of use of different word categories (Lee, Kim, Seo, & Chung, 2007).

Serbian LIWC Dictionary-LIWCser
The basis for the development of the LIWCser dictionary was the LIWC2007 English dictionary. In addition, we used existing adaptations of this software to model the Serbian dictionary. The Serbian dictionary works with the same software as other LIWC2007 dictionaries. This means that the text analysis is conducted in the same successive manner and that the structure of the output is the same for all LIWC2007 adaptations. The LIWCser dictionary corresponds to other dictionaries with respect to formal and content characteristics. LIWCser consists of 12103 words and word stems classified into 65 categories. Table 1 shows LIWCser categories with representative examples of words. The construction of LIWCser went through several phases. First, we translated all the words from the English dictionary and added synonyms, antonyms, and jargon words. The content of the Linguistic categories was defined according to the word lists for grammatical categories given in a Serbian grammar book (Klajn, 2005), so that these categories would be representative of the Serbian language. Then we applied appropriate inflections to all the words from the initial pool. The following step included classification of the words into the categories defined by the LIWC2007 dictionary. In this step, five raters classified each word into one or more categories by joint consensus of all five. In the final phase, two independent judges reviewed the content of all categories and added some culturally specific words.
In the construction of LIWCser, we paid significant attention to the linguistic and cultural context of the future use of the program. Specific characteristics of the Serbian language and culture were included in the dictionary, which resulted in certain deviations from the English version. For example, due to grammar differences the category Articles was excluded, while the categories Superlative and Negative words were added to LIWCser because of their single-word representations in Serbian. Furthermore, in the English dictionary the categories Present, Past, and Future include verbs in different tenses, while in the Serbian version they were replaced with adverbials, since most of the tenses in Serbian do not have a single-word representation. Finally, adding culturally specific words enriched the content of some categories. For example, words that represent important aspects of the Orthodox Christian religion were added to the category Religion, words that mark different family relationships were added to the category Family, the most common informal and swear words were added to the category Swear, etc. (for details of the LIWCser construction see Bjekić et al., 2012).
With 12103 words and word stems, the Serbian dictionary is larger than the English (4500), Dutch (6568), and Spanish (7515) dictionaries, but smaller than the French one (39230). The basic reasons for this are differences between the languages. For example, Spanish adjectives are gender specific, which led to a larger number of word stems in the dictionary (Ramirez-Esparza et al., 2007), while the French dictionary has almost nine times more word stems than the English one, due to a large number of synonyms and different word forms (Piolat et al., 2011). The large number of words and word stems in the Serbian dictionary results from the developed inflectional morphology and from the large number of semantically similar words, slang, and culturally specific words that were included.
The largest number of word stems in LIWCser was classified into the categories Affective and Cognitive processes, similarly to other LIWC adaptations (e.g., Alparone et al., 2004; Pennebaker et al., 2007; Ramirez-Esparza et al., 2007; Wolf et al., 2008; Zijlstra et al., 2004), due to the psychological relevance of these categories (for an overview see Chung & Pennebaker, 2007; Tausczik & Pennebaker, 2010). In order to avoid misclassification in the text analysis, during the classification of the words into categories the authors decided to exclude from the dictionary all words that would fit into different categories when used with different meanings in different contexts (Bjekić et al., 2012).

Aim of the Research
The variety of information that automatic text analysis, and LIWC specifically, provides has contributed to the expanding use of this software. The development of the dictionary in several languages enabled research in non-English-speaking countries and cross-language evaluation of the findings obtained in English-speaking regions (Kroner-Herwig, Linkemann, & Morris, 2004; Lee et al., 2007; Yogo & Fujihara, 2008). Further, it enabled cross-cultural comparisons, bilingualism research, research on second language acquisition, follow-up of vocabulary development in different communities, and insight into psychologically relevant linguistic aspects of different languages (Kim, 2008; Ramirez-Esparza, Gosling, Benet-Martinez, Potter, & Pennebaker, 2006). Finally, the development of dictionaries for automatic text analysis in different languages gives a larger number of researchers an opportunity to investigate relations between psychological phenomena and language.
The aim of this paper is to present data on the psychometric properties of the Serbian dictionary for the LIWC software, LIWCser (Bjekic et al., 2012). Since LIWCser has certain specificities resulting from inter-language differences (e.g., the authors had to make specific decisions about certain categories in the process of construction), several aspects of the dictionary were tested in order to assess its quality. First, the equivalence of results obtained with LIWCser and LIWC2007 was analysed. Second, we assessed the efficacy of the dictionary when processing different forms of texts, i.e., the comprehensiveness of the LIWCser dictionary. In addition, the average representation of each category in different types of texts was calculated, in order to gain information about the influence of the specific context, which depends on the type of text that is analysed. Finally, we tested the impact of the exclusion of homonymous words on the comprehensiveness of the analyses.

Equivalence of LIWCser and LIWC2007
In order to assess the generalizability of the results obtained with the Serbian dictionary to the results obtained with the English LIWC dictionary, we tested the equivalence of the dictionaries on a parallel Serbian-English sample of texts.

Method
Sample. For equivalence testing, a sample of 141 texts was used, out of which 46 (32.6%) were abstracts of scientific papers, 54 (38.3%) were newspaper articles, and 41 (29.1%) were movie subtitles. Each text was available in both Serbian and English; specifically, abstracts and newspaper articles were originally in Serbian and then translated into English, while movie subtitles were originally in English and then translated into Serbian by a professional 4. At the level of words, the sample size is satisfactory, since it covers more than 35000 words (Wolf et al., 2008).
Scientific journal abstracts were selected from different issues of the journal Psihologija published between 2000 and 2008. The criteria for abstract selection were that the texts represent the majority of fields in psychology and that the abstracts have the highest quality of translation from Serbian to English.
Newspaper articles were selected from the electronic version of the JAT revija magazine 5, which was chosen for several reasons. First, the magazine is bilingual, with professionals translating each text in full length into English. Second, the magazine covers different topics (e.g., culture, leisure, sport, and politics) and formats (e.g., reports, interviews). These topics are relatively equally represented, which adds to the diversity of content and writing styles. Finally, all articles have satisfactory length, which adds to the reliability of the analysis. All articles from the period between October 2010 and May 2012 were analysed.
Movie subtitles and their translations that were included in the analyses were downloaded from the internet 6. Subtitles of eleven movies nominated for the American Academy Award in the period between 2007 and 2011 were selected. We divided each film into 3 to 5 parts of equal length. Subtitles were included in the analysis because of the similarities between everyday language and the language used in movies. Therefore, it was possible to observe differences in the representativeness of LIWC categories in oral and written language.
Text analysis. No text corrections were made before processing, i.e., we did not correct possible printing errors, nor did we exclude words that could be irrelevant for the analysis (e.g., personal names). English and Serbian texts were analysed with LIWC2007 English and with LIWCser, respectively.
Data analysis. Equivalence between the dictionaries was assessed in a similar way as in the German adaptation of LIWC (Wolf et al., 2008). The overall number of texts belonging to the three aforementioned types was 282 (141 in Serbian and 141 in English). LIWC categories were calculated for both Serbian and English and stored in the database (texts in the database emulated subjects, i.e., texts were stored in rows, whilst LIWC categories for both languages were presented in columns). Descriptive statistics indicating representation (% of each category in a given text) and variability of different LIWC categories were calculated for all texts, separately for the English and Serbian versions. For the assessment of equivalence between the English and Serbian LIWC dictionaries, rank correlations were calculated (instead of Pearson correlations), thus avoiding potential problems resulting from extreme values and the usually non-normal distributions of the LIWC categories (Wolf et al., 2008). As the primary measure of equivalence of the two dictionaries, we used the intraclass correlation coefficient (ICC; two-way mixed-effects model, consistency type). This measure directly reflects the proportion of between-texts variance (similar LIWC category values for both languages within a particular text) in the overall variance (between-texts + within-texts variance). Both measures of equivalence were calculated for each LIWC category across all texts.
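As an illustration of the two equivalence measures, the sketch below computes a rank correlation and a consistency-type ICC for one LIWC category measured on paired texts. It is a minimal sketch under stated assumptions, not the analysis script used in this study: the function names are hypothetical, the ICC is computed in its single-measure form (the text does not state whether single or average measures were reported), and the example percentages are invented.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Rank correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def icc_consistency(x, y):
    """Two-way mixed, consistency, single-measure ICC for two columns.

    Rows are texts; the two columns are the Serbian and English values.
    """
    x, y = list(x), list(y)
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]
    col_means = [mean(x), mean(y)]
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between texts
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between languages
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Invented category percentages for five paired texts (illustration only).
serbian = [2.1, 0.4, 3.0, 1.2, 0.9]
english = [3.5, 1.0, 4.8, 2.2, 2.5]
print(spearman(serbian, english), icc_consistency(serbian, english))
```

Because the consistency type removes the between-language mean difference from the error term, a category that is systematically more frequent in one language (as function words are in English) can still show high equivalence as long as texts keep a similar relative standing in both languages.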

Results
Serbian texts have on average 300 words fewer than the parallel texts in English, i.e., English texts contain on average two words per sentence more than Serbian texts. In addition, English texts have a higher percentage of function words (about 50% in English in comparison to 30% in Serbian). The largest difference is in the frequency of first person singular pronouns, which make up 7% of all words in the English texts and about 3% in the Serbian texts.
The average correlation between pairs of Serbian and English LIWC results was .65, and the average intraclass correlation coefficient (ICC) was .70, with 76% of categories having correlations higher than .60. Analysis of LIWC2007-LIWCser equivalence across text types revealed that a high level of equivalence exists for all three types of texts, although the type of text influenced LIWC2007-LIWCser equivalence to some extent (Appendix 1). Thus, equivalence is on average .69 for abstracts of scientific papers, .71 for movie subtitles, and .75 for newspaper articles.

Discussion
When we compare the formal characteristics of the texts in English and in Serbian, differences in total word count and average number of words per sentence are noticeable. This is a result of grammar differences between the languages. For example, English has articles, which do not exist in Serbian. In addition, there is a difference in the proportion of function words in the text between Serbian and English. This is the consequence of two factors. First, some function words in Serbian are homonyms (e.g., "da" is a conjunction ("to") and an assertive word ("yes")), and those words were not included in the dictionary 7. Second, since Serbian is a highly inflective language, considerable differences in syntactic structure exist between Serbian and English. For example, verbs in Serbian have suffixes marking person in all verb forms. Consequently, it is not necessary to use pronouns in sentence construction, while in English the use of pronouns is obligatory. This leads to a smaller number of function words in Serbian 8.
The average equivalence between LIWCser and LIWC2007 is satisfactory compared to the same measures between English and some other LIWC dictionaries. For example, the German version demonstrated, on a standardized sample of texts, almost the same level of equivalence with the English dictionary as the Serbian dictionary (average ICC = .70, and average correlation .68) (Wolf et al., 2008). The demonstrated level of equivalence between LIWCser and LIWC2007 can be considered very good, bearing in mind the differences in the dictionaries themselves (i.e., the languages are different and there are differences in the classification of words into categories) and the differences in the quality of translation of the various forms of texts. The results of the equivalence analyses for different types of texts reflect the difference in the quality of translation. Namely, the highest level of equivalence was found for newspaper articles translated by professional translators, and the lowest for abstracts of scientific papers, where authors were more preoccupied with presenting basic data about the research than with the stylistic and formal characteristics of the translation.
Linguistic categories in LIWCser were classified according to grammar rules. Therefore, differences in linguistic categories between the Serbian and English LIWC versions reflect differences in the grammar rules of the two languages. For example, compared to LIWC2007, LIWCser contains a relatively small number of word stems representing auxiliary verbs (144 in LIWC2007 compared to 28 in LIWCser). In addition, the Serbian dictionary contains a lower number of prepositions (60 in LIWC2007 compared to 49 in LIWCser), but a larger number of adverbs (69 in LIWC2007 compared to 154 in LIWCser). The number of word stems in other linguistic categories is relatively equal in LIWCser and LIWC2007.
7 The impact of the exclusion of homonymous words on the comprehensiveness of LIWCser is discussed in a later section.
8 For example, in Serbian both sentences Ja idem kući kolima (I go home by car) and Idem kući kolima (Go home by car) are grammatically correct; the construction with the pronoun is used less often, since the pronoun I is grammatically redundant in this example.
At the level of specific categories, Present and Past have a lower level of equivalence. This is probably due to differences in the content of these categories in LIWCser and LIWC2007 (Bjekić et al., 2012). In addition, results for the category Inclusion do not indicate equivalence of the two dictionaries. A possible reason for this is that the authors of LIWC2007 did not provide an explicit criterion for the classification of words into categories. Therefore, it is possible that in the construction of LIWCser we used different criteria than the LIWC2007 constructors when selecting words for this category. A similar issue was noticeable in some other LIWC adaptations (e.g., Ramirez-Esparza et al., 2007).
When it comes to paralinguistic categories, lower equivalence is a result of the small number of words belonging to these categories in the texts (which is expected, since we did not analyse spontaneous speech) and of differences in transcription. On the other hand, the categories supplemented with culturally specific words, i.e., Religion, Family, and Leisure, demonstrated high equivalence, which speaks in favour of the decision to add those words during dictionary development.
The findings showed that LIWCser has satisfactory equivalence with the LIWC2007 dictionary, with the exception of a few categories.

Comprehensiveness and Representation of the LIWCser Categories
If we want to obtain reliable results in automatic text analysis, the dictionary must include words that are representative of each specific category. However, the representativeness of the categories cannot be assessed directly (Pennebaker et al., 2007). Usually, a measure of the comprehensiveness of the dictionary, i.e., the percentage of the text covered by the dictionary 9, serves as an indicator of the software's "goodness of fit". If the percentage of words not covered by the dictionary is relatively small, the analysis is more comprehensive and the results are therefore considered more reliable.
When applying automatic text analysis in psychology, researchers often have problems with the interpretation of the results obtained for different categories. Namely, the relative representation of each category partly results from the type of text that is analysed and from its style. In order to gain insight into the expected values of different categories, we investigated differences in the representation of categories depending on the type of the analysed text.

Method
A sample of 386 texts was used, out of which 141 were used for the assessment of the equivalence of LIWCser and LIWC2007 (i.e., scientific abstracts, newspaper articles, and movie subtitles). Of the remaining 245 texts, 140 were short stories 10 written by psychology students as part of research conducted by Lazarević (2012), and 105 were short essays in which respondents reported their attitude towards homosexuals (Bjekić, Živanović, & Žeželj, 2012). To sum up, five different types of texts were analysed: abstracts of scientific papers, newspaper articles, movie subtitles, short stories, and essays. Each text from the corpus of short stories and short essays was processed with LIWCser.

Results
The LIWCser dictionary covers on average 69.93% of the words in the texts. As seen from Table 3, the comprehensiveness of the LIWCser dictionary differs depending on the type of text that is analysed: it is highest for essays and short stories, and lowest for abstracts of scientific papers. The average representation of different LIWCser categories also differs depending on the type of text. The largest differences occur in linguistic categories, which are the best indicator of writing style, and in psychological categories. For example, the frequency of first person pronouns is higher in essays about a specific topic than in other types of texts (in abstracts of scientific papers, words from these categories are almost absent). When it comes to psychological categories, slightly higher values are obtained for essays, short stories, and movie subtitles than for abstracts of scientific papers and newspaper articles. Table 4 presents the representation of LIWCser categories for different types of texts.

Discussion
The results on the comprehensiveness of the LIWCser dictionary demonstrate that it is possible to extract reliable information about the texts that are analysed. When the percentage of words covered by the LIWCser dictionary is compared to that of other LIWC dictionaries, we observe that the Serbian dictionary covers on average a larger percentage of the text than the French (54%) (Piolat et al., 2011), German (63%) (Wolf et al., 2008), and Spanish (66%) (Ramirez-Esparza et al., 2007) dictionaries, and approximately the same percentage as the Dutch dictionary (70%) (Zijlstra et al., 2004). Therefore, we can conclude that the Serbian LIWC dictionary is quite successful when it comes to dictionary comprehensiveness, i.e., the reliability of the information obtained.
The differences in the percentage of words covered by the dictionary for the different types of texts are in line with expectations. Namely, coverage is lowest for scientific abstracts and highest for short stories and essays. This is understandable, since scientific abstracts mostly consist of specific terms, and LIWC does not contain professional terminology because its primary purpose is the analysis of everyday verbal output (Pennebaker et al., 2007). The style of short stories and student essays is relatively informal and closest to everyday speech.
The coverage of different types of texts in Serbian is similar to the results obtained in English. The results of the validation of LIWC2007 show the lowest percentage of coverage for scientific abstracts (53%) and the highest for oral speech (91%) and emotional writing (93%) (Pennebaker et al., 2007). These results suggest that the LIWC software is largely adapted for the analysis of everyday oral and written language, both in English and in Serbian.
The results on the representation of each LIWCser category in different types of texts provide insight into how values for different categories vary across text types. These values are descriptive. One should bear in mind that they were not obtained on a representative sample of each type of text, and that they serve more as a general tendency than as a norm. In other words, it is advisable to use this information as a general guideline about the basic characteristics of different types of verbal output when interpreting results. For example, scientific abstracts usually have longer sentences, a lower proportion of function words, relatively rare use of pronouns and negations, and a lack of informal words. These tendencies are in accordance with the linguistic characteristics of scientific style, e.g., monologic character, use of normed speech, and higher saturation of the text with meaning (Simić, 2001). Newspaper articles are characterized by medium-length sentences and by less frequent use of affective words and words marking cognitive processes than other kinds of texts, which indicates objectivity and restraint in expression, the standard characteristics of this kind of writing (Katanić-Bakaršić, 1999). Movie subtitles were included because they are highly representative of everyday speech. Therefore, they usually have short sentences, frequent use of personal pronouns, content that refers to the present tense, informal words, etc. Short stories and essays represent a written form of everyday speech. These kinds of texts are characterized by a higher frequency of function words and more frequent use of pronouns and verbs (i.e., sentences with basic structure) 11.
Differences in the percentages of LIWC categories depending on the type of text stress the importance of both the context in which verbal communication takes place and the validity of the content of specific categories (Pennebaker et al., 2007). In other words, texts written with different aims and in different contexts are expected to diverge in style and content. If software for ATA detects those differences and they are interpretable (i.e., the results are in line with the general characteristics of a specific type of text), the software can be considered a valid instrument.
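As an illustration of how such differences between text types can be tested, the following is a minimal sketch of the Kruskal-Wallis H statistic, the test named in the caption of Table 4. It is not the authors' analysis script: the function names and the example percentages are hypothetical, and the statistic is computed without the correction for ties that statistical packages usually apply.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of samples (no tie correction applied)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical percentages of one category (e.g., pronouns) in three text types.
abstracts = [2.1, 3.4, 2.8]
newspapers = [5.0, 6.2, 5.5]
subtitles = [12.3, 14.1, 11.8]
print(kruskal_wallis_h([abstracts, newspapers, subtitles]))
```

With k groups, H is compared against a chi-square distribution with k - 1 degrees of freedom; for very small groups such as those in this toy example, exact tables are more appropriate.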

Impact of Homonymous Words Exclusion on the Comprehensiveness of the LIWCser
Unlike the authors of LIWC, during dictionary construction we decided to exclude all words that could be classified into different categories depending on the context (i.e., homographs and homophones). Although this decision resulted in a lower number of words in the dictionary and a lower percentage of text coverage in the analyses, it avoided misclassification of words as much as possible and consequently lowered the measurement error. To estimate the percentage of words left out of the analyses due to the exclusion of homonyms, the percentage of excluded homonymous words across texts was calculated.

Method
An additional dictionary of homonyms was constructed; it included 323 word stems that were initially excluded from LIWCser due to homonymy. In this dictionary, 8.7% were function words. The same sample of 386 texts was processed again.
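The procedure can be sketched as follows. This is a hedged illustration rather than the LIWCser implementation: the tokenizer, the convention that a trailing asterisk marks a word stem, and the example entries are all assumptions made for the sketch, and the entries are invented rather than taken from the LIWCser dictionary.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (Unicode-aware, so Serbian letters are kept)."""
    return re.findall(r"\w+", text.lower())


def matches(token, entries):
    """True if the token equals an entry or starts with a stem entry ending in '*'."""
    for entry in entries:
        if entry.endswith("*"):
            if token.startswith(entry[:-1]):
                return True
        elif token == entry:
            return True
    return False


def coverage(text, entries):
    """Percentage of tokens in the text that are recognised by the dictionary entries."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(matches(token, entries) for token in tokens)
    return 100.0 * hits / len(tokens)


# Hypothetical entries: a base dictionary and a separate homonym dictionary.
base_entries = {"ja", "vol*"}
homonym_entries = {"kos*"}

text = "ja volim kosu"
without_homonyms = coverage(text, base_entries)
with_homonyms = coverage(text, base_entries | homonym_entries)
print(without_homonyms, with_homonyms)
```

The difference between the two coverage values, averaged over all texts, gives the share of words lost by excluding homonyms.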

Discussion
Results show a relatively low number of homonymous words in the analysed texts. If the homonyms were included in the LIWCser dictionary, its comprehensiveness would be on average 75%, instead of the 70% demonstrated in the previous analyses using the LIWCser dictionary. In other words, exclusion of this type of words did not significantly reduce the quality of the LIWCser dictionary in terms of its comprehensiveness (from 75% with homonyms included to 70% without homonyms). In addition, it empirically supports the primary decision to exclude homonymous words in order to avoid the possibility of misclassifying such words during text analysis. However, since the authors of other LIWC adaptations did not report results of homonym analyses, the question remains whether these results can be generalised cross-linguistically.

11 Sentences with basic structure consist of a minimum number of words that can convey a certain meaning. The basic structure of the sentence usually consists of three constituents in canonical word order.

General Discussion and Conclusion
The use of automatic text analysis, and specifically of LIWC software, has recently become more frequent in psychological research in both English-speaking and non-English-speaking countries (see Pennebaker et al., 2003; Tausczik & Pennebaker, 2010). This kind of text analysis has several advantages. ATA provides researchers with objective quantitative data on a large number of content and stylistic characteristics of the text and allows the application of various statistical analyses. In addition, the analysis is simple, reliable, and low-cost, and a sample is relatively easy to assemble (i.e., we can use the internet, e-mails, literature, speeches, etc.).
All analyses demonstrated a satisfactory level of equivalence between the Serbian and English versions of the dictionary, which enables cross-language evaluation. Empirical evidence from this study validates LIWCser as a method strong enough to analyse texts in Serbian with the same quality with which LIWC2007 processes English verbal products. Although some categories did not show a high level of equivalence, the results revealed that, overall, LIWCser shows a level of equivalence similar to that of other translations of the dictionary.
The high percentage of text coverage, and the stability of this percentage depending on the type of text, provide further evidence of the validity of LIWCser as an assessment method in psychological research. Overall, LIWCser performs similarly to LIWC2007. Specifically, the results demonstrated that LIWCser performs better when processing texts with a more informal style than when processing more formal texts. This adds to the validity of LIWCser as an instrument designed to analyse texts saturated with psychologically relevant content.
The final study, on the analysis of homonyms, demonstrated that the decision to exclude a relatively small percentage of words in order to avoid possible misclassification proved to be sound. On average, only 4-5% of the words that were not initially classified with LIWCser belong to the group of homonyms. This result supports the decision to improve the reliability of the classification by excluding potentially misclassified words.
To conclude, several arguments favour LIWCser as a good instrument for the analysis of texts in Serbian. First, since the basis for the development of LIWCser was the English dictionary, researchers have a clear theoretical and methodological framework. Second, all analyses indicate good psychometric properties of the instrument. In addition, LIWCser is very user-friendly and offers the possibility of creating new categories depending on the needs of the researcher. Finally, during the development of LIWCser, significant attention was paid to the cultural and linguistic specificities of the Serbian language. It should be a useful tool for all professionals interested in studying various aspects of linguistic behaviour, especially spontaneously produced verbal material.

Table 1. LIWCser categories with representative examples of words.

Table 2. Equivalence of LIWC2007 and LIWCser
Table 2 presents descriptive statistics for each category in the English and Serbian LIWC dictionaries and data on dictionary equivalence. The average ICC for Linguistic categories was .74, for Psychological categories it was slightly lower (ICC=.72), and for Personal concerns it was highest (ICC=.75). At the level of specific categories, the highest equivalence was observed for Religion (ICC=.96), Family (ICC=.96), Negations (ICC=.95), Sex and love (ICC=.93), Sadness (ICC=.92), Achievement (ICC=.91), and Leisure (ICC=.90). Categories with the lowest equivalence were Present (ICC=.30), Anger (ICC=.29), and Feeling (ICC=.24), while for the categories Past and Inclusion the ICCs were close to zero.

Table 4. Descriptives on LIWCser categories for different kinds of texts and results of the Kruskal-Wallis test
