Cross-cultural validation of the “International Affective Picture System” (IAPS) on a sample from Bosnia and Herzegovina

In this study, the normative ratings of the International Affective Picture System (IAPS; Center for the Study of Emotion and Attention [CSEA], 1995) were compared with ratings from a Bosnian sample. Seventy-two psychology undergraduates from the University of Sarajevo (Bosnia and Herzegovina) rated valence, arousal and dominance for a stratified sample of 60 pictures selected from the IAPS. Reliability coefficients indicate that the self-report ratings are internally consistent. The affective ratings from our sample correlated strongly with the North American ratings: .95, .81 and .91 for valence, arousal and dominance, respectively. Consistent with expectations, mean valence and dominance ratings did not differ significantly between the Bosnian and North American samples. Furthermore, plotting the Bosnian valence and arousal ratings yields a boomerang-shaped distribution similar to that of the North American affective ratings. Taken together, the findings obtained from the Bosnian sample confirm the cross-cultural validity of the IAPS.

dominance. This relatively simple view hinges on previous seminal work by Osgood, Suci & Tanenbaum (1957) on the semantic differential. In this model, a factor analysis conducted on a wide variety of verbal judgments indicated that the variance in emotional assessments was accounted for by three major dimensions: two primary ones, valence and arousal, and a third, less strongly related dimension called dominance. This dimensional view has been advocated by a plethora of theorists, including Wundt (1898), Mehrabian and Russell (1974) and Tellegen (1985). The meaning of each dimension corresponds to a description given in the instructions for the assessment process (see below). These ratings are reliable (Lang et al., 2005) and have been corroborated by other self-assessment procedures (e.g., Ito, Cacioppo, & Lang, 1998; Kwon et al., 2009), by a range of psychophysiological measures (e.g., Smith, Löw, Bradley, & Lang, 2006), by positron emission tomography (Reiman, Lane, Ahern, Schwartz, & Davidson, 2000) and by fMRI (Lang, Bradley, Fitzsimmons, Cuthbert, & Scott, 1998). In addition, the IAPS has been successfully used in various psychological studies, ranging from selective attention in anxiety (Moog et al., 2000), risk perception and selective memory recall (Drace, Desrichard, Shepperd, & Hoorens, 2009; Drace, Ric, & Desrichard, 2010; Drace, in press) to abnormal affect startle modulation in psychopaths (Levenston, Patrick, Bradley, & Lang, 2000).
To date, several laboratories around the world have found high stability of affective ratings, including those in the US (Lang et al., 1999), Germany (Hamm & Vaitl, 1993), Belgium (Verschuere, Crombez, & Kostner, 2001), Spain (Ramirez et al., 1998), Brazil (Lasaitis et al., 2008; Ribeiro et al., 2005), Chile (Dufey et al., 2011), Italy and Sweden (mentioned in Bradley, 1994), suggesting that emotional reactions elicited by IAPS pictures are stable across cultures. The present study investigates whether the ratings of a Bosnian sample are comparable to the normative ratings obtained from North American participants. Using the normative rating procedure (Lang et al., 2008), Bosnian undergraduate students rated a sample of 60 IAPS photographs on three dimensions: valence, arousal and dominance. First, in order to test the cultural stability of the affective ratings, the mean affective ratings obtained for each picture in our sample were compared with those obtained in North America. Second, for each dimension a high correlation was expected between the Bosnian and North American samples. Third, we expected that plotting valence against arousal would result in the typical boomerang-shaped distribution, with pictures rated more extremely on the valence dimension (either positive or negative) receiving higher scores on the arousal dimension.

Method
Participants. A total of seventy-two undergraduate psychology students (55 females) of the University of Sarajevo took part in the study. Participants were tested in 4 separate groups, each consisting of 15 to 20 students. All participants completed a pre-experimental consent form and received course credit for their participation. In comparison, the American ratings were obtained from a similar sample of approximately 100 college students taking an introductory psychology course.
Materials. As in previous studies (e.g., Lang, Bradley, & Cuthbert, 1999), 60 different pictures were used. Digitized pictures were presented with PowerPoint 2007 using an HP personal computer. Participants were seated approximately 2 to 4 m from a 1.75 × 2.4 m screen on which pictures of nearly the same size were presented.
Affective ratings were made using a paper-and-pencil version of the Self-Assessment Manikin (SAM; Lang, 1980), which uses sequences of humanoid figures to depict gradations along three bipolar affect dimensions: valence (low = unhappy/unsatisfied; high = happy/pleased), arousal (low = calm, relaxed; high = excited, aroused), and dominance (low = submissive, controlled; high = dominant, in control). On each of the three SAM scales, participants were instructed to place an X over the figure that best represented how they felt while viewing the last slide; rating values ranged from 1 to 9. (For more details about IAPS rating procedures, see Lang et al., 1995, and published reports by Bradley & Lang, 1994; Greenwald et al., 1989; and Lang et al., 1993.) Employing this protocol, norms were established for each IAPS slide, locating its mean position on the affective dimensions represented by SAM.
In order to compile a representative sample of the IAPS, we used a stratification procedure to select stimuli from the total set of over 604 pictures (Verschuere et al., 2001). First, on each of the three dimensions, slides with a mean rating of less than 4 out of 9 were classified as low, those between 4 and 6 as average, and those with a mean rating above 6 as high. Then, using the normative ratings, each picture was assigned to one of the 27 resulting strata. For example, picture 9330 (garbage) belongs to the stratum with low valence, average arousal and average dominance. Next, the number of pictures in each stratum of the total set was counted and a corresponding percentage of pictures was randomly selected. For example, the first stratum of pictures (high on all three dimensions) consisted of 25 pictures (which is 4% of the total number of pictures), and in turn 4% of pictures from this stratum were selected for the final stimulus sample.
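The stratification steps can be illustrated with a minimal sketch. It assumes proportional allocation, meaning that the same sampling fraction is applied to every stratum, which is one reading of the procedure described above. The function names, the picture identifiers and the normative values in the example are hypothetical and are not taken from the IAPS norms.

```python
import random
from collections import defaultdict


def level(mean_rating):
    """Classify a mean rating on the 1-9 SAM scale as low, average or high."""
    if mean_rating < 4:
        return "low"
    if mean_rating <= 6:
        return "average"
    return "high"


def stratum(norms):
    """Map (valence, arousal, dominance) means to one of the 27 strata."""
    return tuple(level(x) for x in norms)


def stratified_sample(pictures, fraction, seed=0):
    """Randomly draw the same fraction of pictures from every stratum.

    pictures: dict mapping a picture id to its (valence, arousal, dominance)
    normative means. The per-stratum count is rounded with Python's round(),
    an assumption of this sketch rather than a documented rule.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for pic_id, norms in pictures.items():
        strata[stratum(norms)].append(pic_id)
    selected = []
    for key in sorted(strata):
        members = sorted(strata[key])
        k = round(len(members) * fraction)
        selected.extend(rng.sample(members, k))
    return selected


# Hypothetical normative means for an unpleasant, moderately arousing picture.
print(stratum((2.9, 5.2, 4.8)))  # ('low', 'average', 'average')
```

With a common sampling fraction, each stratum keeps roughly the same share in the selected sample as it has in the full picture set, apart from rounding.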

Reliability
Reliability in this study was assessed in two ways: first, Cronbach's alpha coefficient was calculated for each of the three dimensions; afterwards, a split-half correlation was calculated. The alpha coefficients were .72 for valence, .96 for arousal and .96 for dominance. For the purpose of the split-half correlation, the total group was divided in two: participants with an even versus an uneven participant number. The correlation between the mean ratings of the even and uneven participants was .67 for valence, .92 for arousal and .93 for dominance (all p = .01).
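Both reliability indices can be sketched in a few lines. The layout of the data is an assumption: for alpha, each row is one case and each column one item (the text does not state whether participants or pictures served as items); for the split-half correlation, `ratings[p][i]` is the rating given by participant `p` to picture `i` on a single dimension. All function names are illustrative.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of cases, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def split_half_r(ratings):
    """Correlate per-picture mean ratings of odd- and even-numbered participants.

    ratings[p][i] is the rating of picture i by the participant at index p;
    index 0 is participant number 1, so even indices are the uneven numbers.
    """
    uneven = ratings[0::2]
    even = ratings[1::2]
    uneven_means = [mean(col) for col in zip(*uneven)]
    even_means = [mean(col) for col in zip(*even)]
    return pearson_r(uneven_means, even_means)
```

Note that the split-half value here is the plain correlation between the two half-sample means, as described in the text, without any further correction.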

Ratings of valence, arousal and dominance
The overall mean was 5.1 (SD = 0.38) for valence, 4.5 (SD = 1.2) for arousal and 5.7 (SD = 1.2) for dominance. The mean North American ratings, which are based on groups of similar sample size, are 4.9 (SD = 1.9) for valence, 5.1 (SD = 1.3) for arousal and 5.2 (SD = 1.2) for dominance. Paired t-tests at the .01 level revealed no significant differences for the valence ratings, t(59) = -0.5, or for the dominance ratings, t(59) = -2.5. Mean Bosnian arousal ratings were significantly higher than mean North American ratings, t(59) = 3 (p = .01). The overall correlation between valence and arousal in our sample was -.48 (p = .001), indicating that higher levels of arousal are related to a more negative perception of pictures. The correlations between mean Bosnian and North American ratings are .95 for valence, .81 for arousal and .91 for dominance (all p = .001). The mean ratings made by male and female participants, pooled across all three dimensions, were very similar, r = .81 (p = .01). The correlations with the North American ratings are .89 for valence, .93 for arousal and .81 for dominance for men, and .96 for valence, .87 for arousal and .93 for dominance for women (all p = .001). Men show a higher correlation between valence and arousal (.34) than women (.11), suggesting that they perceive the more positive pictures as more arousing than women do.
The overall minimum and maximum ratings are in concordance with the USA sample. The valence ratings in our sample range from 1.2 (newborn) to 7.5 (cake), compared to 1.6 (burn victim) to 8.34 (puppies) in the American sample for the 60 pictures used in this study. Arousal ratings ranged from 2.6 (iron) to 7 (newborn), compared to 2.41 (man) to 7.39 (sky) in the North American sample. Finally, the dominance ratings ranged from 3.0 (sinking ship) to 7.0 (chair), compared to 2.27 (sinking ship) to 7.49 (outdoors) in the North American sample.
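The degrees of freedom reported above (59) are consistent with a paired t-test computed across the 60 pictures, each picture contributing one mean rating per sample. A minimal sketch of that statistic follows; the function name and the example values are hypothetical, and the sign of t depends on the order of subtraction, which the text does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(sample_a, sample_b):
    """Paired t statistic and degrees of freedom for matched observations.

    sample_a[i] and sample_b[i] are the mean ratings of the same picture i
    in two samples. The difference is taken as sample_a minus sample_b.
    """
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-picture means for four pictures in two samples.
t, df = paired_t([5.0, 6.0, 7.0, 8.0], [4.0, 6.0, 5.0, 7.0])
print(round(t, 3), df)
```

The p-value would then be read from a t distribution with the returned degrees of freedom; in practice a statistics package such as SciPy (`scipy.stats.ttest_rel`) returns both values directly.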

The affective space
The mean valence and arousal values for the Bosnian sample are plotted in Figure 1. Dominance has been excluded because it typically explains less of the variance in affective ratings and because the labels of valence and arousal have been used consistently across various IAPS studies (Bradley, 1994). The shape of the affective space is remarkably similar to that in previous studies (e.g., Bradley, Greenwald, Petry, & Lang, 1992; Dufey, Fernandez, & Mayol, 2011). The boomerang-shaped distribution shows that pictures rated either high or low on the valence scale are also rated high on arousal. This is shown by the positive trend between valence and arousal for positive pictures (r = .20) and the negative correlation between valence and arousal for negative pictures (r = -.15). Both the linear and the quadratic functions were significant. However, a better fit was observed for the quadratic function, with R² = .40 compared to .23 for the linear equation. This is comparable to the North American sample, where the quadratic and linear values are .54 and .28, respectively.
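The comparison of the linear and quadratic functions can be sketched as an ordinary least-squares polynomial fit followed by the computation of R² for each degree. The sketch assumes that arousal is regressed on valence, which the text does not state explicitly, and the data in the example are synthetic. The function names are illustrative.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via normal equations."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y] for the monomial basis 1, x, x^2, ...
    rows = [
        [sum(xi ** (r + c) for xi in x) for c in range(m)]
        + [sum(yi * xi ** r for xi, yi in zip(x, y))]
        for r in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] / rows[i][i] for i in range(m)]


def r_squared(x, y, degree):
    """Proportion of variance in y explained by a polynomial of given degree."""
    coef = polyfit(x, y, degree)
    predicted = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic U-shaped data: arousal is lowest at neutral valence (5).
valence = [1, 2, 3, 4, 5, 6, 7, 8, 9]
arousal = [3 + 0.25 * (v - 5) ** 2 for v in valence]
print(r_squared(valence, arousal, 1), r_squared(valence, arousal, 2))
```

For a symmetric U-shaped relation like the synthetic one above, the linear term explains almost nothing while the quadratic term captures the curve, which is the pattern a boomerang-shaped affective space is expected to produce.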

DISCUSSION
Given the recent resurgence of psychological research on emotions and mood, it helps to have a universally applicable set of stimuli capable of producing the desired effects across all demographic strata, thus providing a methodological tool that enhances comparability. Several labs have already shown the general cross-cultural applicability of the International Affective Picture System (e.g., Lang et al., 1999; Hamm & Vaitl, 1993; Ramirez et al., 1998; Dufey, Fernandez, & Mayol, 2010), and in this study we have done the same with a comparable sample from Bosnia and Herzegovina. Our findings strongly suggest that the affective ratings provided by the Bosnian sample are comparable to the North American ones. We base this conclusion on several findings. First, no significant differences were found between the ratings of the Bosnian and the North American samples on two of the three dimensions. Second, extremely high correlations were obtained. Third, the shape of the affective space is very similar to that in other validation studies (e.g., Bradley & Lang, 2007; Gruehn & Scheibe, 2008; Vila et al., 2001) and in the North American normative ratings, showing that pictures rated as either highly positive or highly negative on the valence dimension are also rated highly on the arousal dimension. This is confirmed by a curve estimation showing that the quadratic association is stronger than the linear one, suggesting that at higher levels of arousal, both positive and negative valence become more pronounced. On a side note, the minimum and maximum ratings are congruent with those from the North American sample for the same representative set of pictures, showing that pictures with thematically the same content also receive the same minimum and maximum ratings.
There is, however, a slight difference in our sample: the mean arousal ratings of the Bosnian participants are significantly higher than those of the North American participants. Similar results were obtained in various other validations, including those conducted in Spain (Molto et al., 1999; Vila et al., 2001), Brazil (Lasaitis et al., 2008; Ribeiro et al., 2005), Germany (Gruehn & Scheibe, 2008) and Chile (Dufey et al., 2011). Even Bradley and Lang (2007) have noted the variability in arousal ratings among cultures, assuming that the IAPS can be sensitive to intercultural differences in emotion disposition, so that it can reliably be used for detecting cross-cultural differences in affective experience. At a theoretical level, it can be noted that whereas both the appetitive and defensive systems (detectable on the valence dimension) behave in a more robust way, as supported by the high consistency of findings across studies on the valence dimension, arousal sensitivity may depend more on cultural aspects. We can therefore hypothesize that there exists a cultural inclination toward emotional expressivity in the sample from Bosnia and Herzegovina (see Dufey et al., 2011; Molto, 2005; and Ramirez et al., 1998). Future studies should take this finding into consideration.
One limitation of the present study is that only a selection of the IAPS pictures was rated, and by a limited number of participants, all of whom were undergraduate psychology students. Other possible limitations include the relatively small number of male participants and a smaller sample size than the approximately 100 participants in the North American sample. One could also mention that the alpha value for the valence dimension is relatively small compared to the North American ones (between .93 and .94). However, upon close examination of the literature (e.g., DeVellis, 1991, who suggests that a reliability between .65 and .70 is acceptable and one between .70 and .80 is respectable; Schmitt, 1996, a technical paper showing various impediments and pitfalls of alpha usage, which proclaims that a reliability of .70, ±3 is perfectly acceptable), we have corroborated that our values of .72 and .67 (split-half) represent respectable and minimally acceptable reliability, respectively. Future research should include more participants, different populations (e.g., considering age and gender differences) and/or different stimuli in order to rule out these possibilities and contribute to the overall applicability of the IAPS.

CONCLUSION
Our study clearly adds support to the cross-cultural validity of the IAPS, as well as to its methodology and the system's picture content. The overall similarities between the data obtained in our sample and the North American data suggest a culturally valid implementation of the IAPS on a Bosnian sample. Furthermore, given the previously established stability of affective evaluations across different countries, our study suggests that the IAPS can be considered a valid system of affect-eliciting stimuli that can be used as a representative tool in experimental studies and other research, including work on mood, emotion induction, priming and so forth.

Figure 1. A plot of the stratified sample of 60 stimuli selected from the IAPS in the two-dimensional space defined by valence and arousal.