Testing the Model of Psychological Flexibility in the Serbian Cultural Context: The Psychometric Properties of the Acceptance and Action Questionnaire

The aim of this study was to examine the theoretical model of psychological flexibility proposed by the creators of Acceptance and Commitment Therapy. To do this, we investigated the structural and convergent validity of the Acceptance and Action Questionnaire – II, translated into the Serbian language. The study was performed on 1781 Serbian-speaking adults. Two models were tested using robust confirmatory factor analysis. The best fit was achieved with the data-driven model constructed by adding three pairs of correlated residuals to the original hypothesized model of latent structure. This model demonstrated structural invariance across samples divided by gender and by previous experience with services provided by mental health professionals. Excellent convergent validity of the scale was demonstrated through adequate correlations with indicators of subjective well-being, styles of emotional regulation, and coping strategies. Our findings suggest that the Serbian adaptation of the Acceptance and Action Questionnaire – II is a valid and reliable measure of behavioral avoidance, implying that the construct of psychological flexibility is conceptually similar to those previously found across various languages and cultural contexts.

In addition to these conceptual problems, some recent findings based on item response analysis suggest that the AAQ-II is not as psychometrically sound (Ong, Pierce, Woods, Twohig, & Levin, 2019) as could be concluded from research based on classical test theory (e.g., Bond et al., 2011; Fledderus et al., 2012; Gloster et al., 2011; Kleszcz et al., 2018; Pennato et al., 2013).
Despite this criticism, there is a growing pool of studies examining experiential avoidance related to specific conditions using the AAQ and AAQ-II, such as addiction severity (Forsyth, Parker, & Finlay, 2003), the development of anxiety problems (Karekla, Forsyth, & Kelly, 2004), post-traumatic stress disorder (Marx & Sloan, 2005), disability and suffering associated with chronic pain (McCracken, Vowles, & Zhao-O'Brien, 2010), mental health stigma (Masuda, Price, Anderson, Schmertz, & Calamaras, 2009), and performance at work (Bond & Bunce, 2003), among others. In most of these studies, experiential avoidance was found to be a transdiagnostic deleterious process resulting in a state of lower functionality, or a predisposing factor placing people at risk of developing psychopathological symptoms. Although higher levels of experiential avoidance have been associated with higher risks of many forms of psychopathology (Levin et al., 2014), the potential of the AAQ-II to discriminate between clinical and nonclinical samples has not been adequately demonstrated (Karekla & Michaelides, 2017; Tyndall et al., 2019). In the past decade, there has also been a noticeable trend toward developing variations of the AAQ in a more disorder-specific manner, in order to evaluate the role of experiential avoidance in particular conditions, including psychosis (Shawyer et al., 2007), chronic pain (Vowles, McCracken, McLeod, & Eccleston, 2008), social anxiety (MacKenzie & Kocovski, 2010), body image (Sandoz, Wilson, Merwin, & Kellum, 2013), substance abuse (Luoma, Drake, Hayes, & Kohlenberg, 2011), smoking dependence (Gifford et al., 2002), and weight-related issues (Lillis & Hayes, 2008). Although their results are promising and highlight the importance of experiential avoidance, most of this research was conducted with the AAQ, i.e., before the development of the AAQ-II.
Given the substantial problems with the original AAQ (Bond et al., 2011; Hayes et al., 2006), updated research using the AAQ-II is needed to support the construct of experiential avoidance in relation to these specific conditions.
Although the excellent psychometric properties and structural validity of the AAQ-II have been repeatedly demonstrated across various cultural contexts, we find there is a need for further cross-cultural validation of the AAQ-II, particularly in the specific social context of the Western Balkans, following the general assumption that specific cultural and societal factors may play an important role in determining one's psychological characteristics. To enable direct comparison of new research findings regarding the role of experiential avoidance in the Western Balkans with the existing research pool, as well as participation in future cross-cultural research, it is necessary to assess the psychometric properties of the AAQ-II translation in this unique part of the European cultural context and make the questionnaire available for clinical and research use. The general aim of this study is to explore the validity of the model of psychological flexibility in Serbia by examining the instrument that measures the construct hypothesized by the model. To do so, we explored the basic psychometric properties of the AAQ-II translated into Serbian, i.e., we evaluated the internal consistency, as well as the structural and convergent validity, of the scale in this specific sociocultural context.

Method
The specific aims of this study were: (1) to investigate the descriptive statistics of the Acceptance and Action Questionnaire – II (AAQ-II; Bond et al., 2011) translated into the Serbian language, (2) to investigate the hypothesized latent structure model of psychological flexibility on the Serbian translation of the Acceptance and Action Questionnaire – II by methods of confirmatory factor analysis, and (3) to investigate the convergent validity of the scale.
We hypothesized high internal consistency and a unidimensional latent structure of the scale with significant factor loadings for all of the items. We expected to fully replicate the original unidimensional model of the AAQ-II (Bond et al., 2011) and to demonstrate structural invariance of the model across subsamples divided by gender and by previous experience with services provided by mental health professionals. Finally, we expected significant and adequate correlations of the AAQ-II score with indicators of subjective well-being, styles of emotional regulation, and coping strategies, indicating convergent validity of the scale.

Participants
A total of 1781 Serbian adults (54% female; mean age = 30.16, SD = 10.31, age range 19-80) participated in this study. Table 1 presents a detailed description of the sample. Participants were recruited using a combination of convenience sampling and the snowball sampling method (Goodman, 1961) initiated by the authors: 180 of our students from Banja Luka, Bosnia and Herzegovina, and Novi Sad, Serbia, were each asked to recruit 10 individuals from their family and social surroundings. Participation in the study was voluntary and anonymous. Neither recruiters nor respondents received any compensation for their participation.

Instruments
A demographic survey was used to assess an array of demographic variables that included age, gender, marital status, education, employment status, and previous experience with professional mental health services.
The Acceptance and Action Questionnaire – II (AAQ-II; Bond et al., 2011) is a 7-item self-report questionnaire aimed at assessing psychological inflexibility, or experiential avoidance. The total score (ranging from 7 to 49) indicates the level of experiential avoidance, that is, the lack of ability to be in connection with present thoughts and feelings without needless defense or to behave in accordance with personal goals and values (Hayes et al., 2006). The instrument was translated into Serbian using the back-translation procedure. One professional translator translated the scale into Serbian, and another independent translator back-translated it into English. After comparing the back-translation to the original items, neither of the translators found any significant differences in item content. The original scale, as well as the Serbian translation, are publicly available.
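The scoring rule described above can be illustrated with a minimal sketch. It assumes each of the seven items is answered on a 1-7 scale, which is what the stated 7-49 total range implies; the function name `score_aaq2` is hypothetical and not part of any published scoring tool.

```python
def score_aaq2(responses):
    """Sum seven AAQ-II item responses into a total score.

    Assumption: every item is an integer from 1 to 7, so totals run
    from 7 to 49; higher totals mean more experiential avoidance.
    """
    if len(responses) != 7:
        raise ValueError("the AAQ-II has exactly 7 items")
    if any(not isinstance(r, int) or r < 1 or r > 7 for r in responses):
        raise ValueError("each response must be an integer from 1 to 7")
    return sum(responses)
```

Rejecting incomplete response sets, rather than imputing them, mirrors the listwise handling of missing data reported in the Data Analytic Strategy.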
The Depression Anxiety and Stress Scale – 21 (DASS-21; Lovibond & Lovibond, 1995) was used to assess negative affective states. The DASS-21 consists of 21 items and includes three subscales: depression, anxiety, and stress. Responses are rated on a 4-point scale, from 0 (did not apply to me at all) to 3 (applied to me greatly, or most of the time). The DASS-21 translation into Serbian is widely used and has shown good reliability on a sample of adults, with alphas for depression, anxiety, stress, and general psychological distress of .85, .81, .84, and .92, respectively (Jovanović, Žuljević, & Brdarić, 2011), as well as on a sample of adolescents, with alphas of .87, .82, .86, and .92, respectively (Jovanović, Gavrilov-Jerković, Žuljević, & Brdarić, 2014).
PSIHOLOGIJA, 2020, Vol. 53(2), 161-181
The Serbian Inventory of Affect based on the Positive and Negative Affect Schedule-X (SIAB-PANAS) is a Serbian translation and adaptation of the Positive and Negative Affect Schedule-X (PANAS-X; Watson & Clark, 1994). For this study, we used the short form to measure Positive Affect (PA) and Negative Affect (NA), with ten items each. Participants were asked to report how they felt in general, using a 5-point Likert scale ranging from 1 (never or almost never) to 5 (always or almost always). The scale demonstrated excellent psychometric properties and good reliability, with Cronbach's alphas of .85 and .83 for PA and NA, respectively (Novović, Mihić, Tovilović, & Jovanović, 2008).
The Satisfaction with Life Scale (SWLS; Diener, Emmons, Larsen, & Griffin, 1985) was used to assess life satisfaction. The responses for each of the five items range from 1 (strongly disagree) to 7 (strongly agree). The Serbian translation of the scale is widely used and shows good psychometric properties, with Cronbach's alpha ranging from .81 to .83 (Jovanović, 2016; Vasić, Šarčević, & Trogrlić, 2011).
The Affective Style Questionnaire (ASQ; Hofmann & Kashdan, 2010) was used to assess the propensity for using three basic styles of emotional regulation. Participants were asked to rate how true each of the 20 items seemed to them, ranging from 1 (not true at all) to 5 (completely true). The scale contains three subscales: Concealing, Adjusting, and Tolerating. Previous research using the Serbian translation has demonstrated fair psychometric characteristics, with Cronbach's alphas of .85 for Concealing and .81 for Adjusting, while the Cronbach's alpha of .58 for Tolerating did not reach the acceptable level (Žuljević, Radović, & Gavrilov-Jerković, 2013).
The Coping Strategy Indicator (CSI; Amirkhan, 1990) was used to assess three basic coping strategies. Participants were asked to select an important stressful event in their lives within the past six months and briefly describe it. Keeping that event in mind, participants were asked to respond to 33 items by indicating on a 3-point Likert scale ranging from 1 (not at all) to 3 (a lot) to what extent they used the described strategy while dealing with the event they remembered. The scale consists of three subscales: Problem solving, Seeking social support, and Avoidance, each containing 11 items and providing a total score ranging from 11 to 33. Higher scores for a strategy indicate greater use of that strategy. The Serbian translation demonstrated fair psychometric characteristics, with Cronbach's alphas for Problem solving, Seeking social support, and Avoidance of .91, .92, and .75, respectively (Žuljević, Jovanović, & Gavrilov-Jerković, 2015).

Data Analytic Strategy
There were no missing data in the final data matrix. Of the initial 1800 participants, 19 had missing data and were omitted from the sample; whether these data were missing at random was not assessed. Descriptive statistics (means, standard deviations, skewness, and kurtosis) were calculated for all items of the scale. To evaluate the internal consistency of the instrument, Cronbach's alpha was computed with a 95% confidence interval, followed by the corrected item-total correlation, the squared multiple correlation, and Cronbach's alpha if the item is deleted for each of the items.
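The item-level reliability statistics named above can be sketched in plain Python. This is an illustrative implementation of the textbook formulas, not the software the authors used; the function names are hypothetical, and the input is assumed to be a list of complete response rows (one list of item scores per respondent).

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        result.append(pearson_r(item, rest))
    return result


def alpha_if_item_deleted(rows):
    """Recompute alpha with each item left out in turn."""
    k = len(rows[0])
    return [cronbach_alpha([row[:j] + row[j + 1:] for row in rows])
            for j in range(k)]
```

An item is usually considered worth keeping when its entry in `alpha_if_item_deleted` is lower than the alpha of the full scale, which is the pattern the Results report for all seven items.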
To test for differences between genders and between groups with and without previous experience with professional mental health services, independent-samples t-tests were conducted. Cohen's d effect sizes were calculated for the t-tests. Effect sizes of .20, .50, and .80 were considered small, medium, and large, respectively (Cohen, 1988). Descriptive statistics (means, standard deviations, skewness, kurtosis, mean item-total correlation, and the Kolmogorov-Smirnov test for normality) were calculated on the total sample, as well as on the mentioned subsamples.
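As a rough illustration of the group comparison described above, the sketch below computes a pooled-variance t statistic and Cohen's d, and labels d with the cutoffs quoted from Cohen (1988). The function names and the "below small" label for d under .20 are our own assumptions, and no p-value is computed here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent groups."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                / (na + nb - 2))


def cohens_d(a, b):
    """Standardized mean difference between groups a and b."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def student_t(a, b):
    """Return the equal-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    t = (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / na + 1 / nb))
    return t, na + nb - 2


def label_effect(d):
    """Label |d| with the .20 / .50 / .80 cutoffs used in the text."""
    d = abs(d)
    if d >= 0.80:
        return "large"
    if d >= 0.50:
        return "medium"
    if d >= 0.20:
        return "small"
    return "below small"  # label is an assumption, not from the source
```

With the full sample of 1781 respondents, the degrees of freedom returned by `student_t` would be 1779, matching the t-tests reported in the Results.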
The general hypothesized latent structure model (Bond et al., 2011) was tested by robust confirmatory factor analysis (CFA) using EQS 6.1 for Windows (Bentler, 2006). Model fit was evaluated with the following indices: the Satorra-Bentler chi-square (SBχ²), the SBχ² ratio (SBχ²/df), the root mean square error of approximation (RMSEA; Steiger, 2016), the standardized root mean square residual (SRMR), the comparative fit index (CFI; Bentler, 1989), and the Bentler-Bonett normed fit index (NFI; Bentler & Bonett, 1980). Good fit was indicated by SBχ²/df < 3, RMSEA and SRMR < .05, and CFI and NFI > .90 (Hu & Bentler, 1998; Kline, 2005; Schumacker & Lomax, 1996). The model was adapted based on a sequential fit diagnostic using the Lagrange multiplier test and compared to the original model. Additional multi-group analyses were conducted to test for measurement invariance of this data-driven model adaptation across gender and previous experience with services provided by mental health professionals. Successively more restrictive models of invariance (unconstrained, measurement weights, and structural covariances) were compared with χ² tests. Nonsignificant χ² differences between increasingly constrained models, accompanied by differences in CFI lower than .01 (Cheung & Rensvold, 2002; Putnick & Bornstein, 2016), indicated invariance. All of the analyses were performed in EQS 6.1. Pearson's correlation coefficients were calculated to test the relations between the AAQ-II and the other measures of the study (DASS-21 and its subscales, SWLS, and all the subscales of SIAB-PANAS, CSI, and ASQ). Reliability of these scores was tested by calculating Cronbach's alpha with a 95% confidence interval bootstrapped on 1000 generated samples.
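The fit criteria listed above amount to a simple checklist, sketched below. The function name `evaluate_fit` is hypothetical, and the function only applies the cutoffs stated in the text to indices that have already been estimated elsewhere (for example, by SEM software); it does not fit a model.

```python
def evaluate_fit(sb_chi2, df, rmsea, srmr, cfi, nfi):
    """Check fit indices against the cutoffs stated in the text.

    Returns a dict mapping each criterion to True (met) or False.
    """
    return {
        "SBchi2/df < 3": sb_chi2 / df < 3,
        "RMSEA < .05": rmsea < 0.05,
        "SRMR < .05": srmr < 0.05,
        "CFI > .90": cfi > 0.90,
        "NFI > .90": nfi > 0.90,
    }


# Made-up index values for illustration only; they are not results
# from this study.
example = evaluate_fit(sb_chi2=20.0, df=10, rmsea=0.03, srmr=0.02,
                       cfi=0.99, nfi=0.98)
```

Reporting each criterion separately, rather than a single pass/fail flag, makes it easy to see which index signals misfit, as happened with SBχ²/df and RMSEA for the original single-factor model.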

Results
Table 2 presents the descriptive statistics for the total score as well as for each of the AAQ-II items. All of the items demonstrate excellent item-total correlations, ranging from .62 to .79, and omitting any of them would lower the internal consistency of the scale. The mean score in the overall sample (M = 16.36; SD = 8.39) and reliability (α = .90) are within the range of parameters demonstrated by other nonclinical samples, e.g., Greek (Karekla & Michaelides, 2017), Hungarian (Eisenbeck & Szabó-Bartha, 2018), and Polish (Kleszcz et al., 2018). Table 3 presents the descriptive statistics for the AAQ-II scores, both for the total sample and for subsamples divided by gender and previous experience with professional mental health services. The female subsample demonstrated significantly higher mean AAQ-II scores than the male subsample (t(1779) = 3.63; p < .01; d = .17). Also, the participants with previous experience with professional mental health services demonstrated significantly higher AAQ-II scores than participants without this kind of experience (t(1779) = 6.87; p < .01; d = .38). There was also a minor deviation from a normal distribution of total scores in all the tested samples.

Latent Structure
Bearing in mind that the normalized Mardia coefficient of multivariate kurtosis (g = 39.21) is significantly higher than the criterion (Bentler, 2006), the robust method of estimation was used (Satorra & Bentler, 1994).
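For reference, Mardia's multivariate kurtosis and its normalized estimate are commonly written as below. The notation is ours (n respondents, p items, x_i the response vector of respondent i, S the sample covariance matrix with divisor n), and the source does not state the exact formula applied by the software, so this should be read as the standard textbook form rather than a description of the authors' computation.

```latex
b_{2,p} = \frac{1}{n}\sum_{i=1}^{n}
          \left[(x_i-\bar{x})^{\top} S^{-1} (x_i-\bar{x})\right]^{2},
\qquad
z = \frac{b_{2,p} - p(p+2)}{\sqrt{8p(p+2)/n}}
```

Under multivariate normality, b_{2,p} is expected to be close to p(p+2), which equals 63 for the seven AAQ-II items, and z is approximately standard normal, so a large positive normalized value points to heavier-than-normal tails.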
Two models were tested. The first hypothesized a single factor and resulted in fair fit indices, although both general and residual indices (SBχ²/df and RMSEA; Table 4) suggested some unexplained variance. The second model was based on a sequential fit diagnostic evaluation according to the Lagrange multiplier test, which indicated that the points of ill fit pertained to the error covariances of three item pairs (Table 5). This model resulted in excellent fit indices and statistically significant factor loadings, both for the total sample (Figure 1) and for the subsamples divided by gender and by previous experience with professional mental health services (Table 4).
Testing the gender invariance of the adapted model (Table 6) by multigroup analysis across gender groups with factor loadings freely estimated demonstrated an excellent fit to the data (Δχ² = 9.89; p = .16; ΔCFI = .00), as did the models with additional constraints imposed for measurement weights (Δχ² = 10.29; p = .12; ΔCFI = .00) and for structural covariances (Δχ² = 12.87; p = .08; ΔCFI = .00), thus suggesting model invariance across genders. The same can be said for model invariance regarding previous experience with professional mental health services (Table 6). The freely estimated model demonstrated excellent fit to the data (Δχ² = 6.15; p = .22; ΔCFI = .00), as did the models imposing additional constraints (Δχ² = 7.14; p = .31; ΔCFI = .00 for measurement weights; Δχ² = 14.23; p = .06; ΔCFI = .00 for structural covariances).
Note. EPMHS = Experience with professional mental health services; SBχ² = Satorra-Bentler corrected χ²; RMSEA = Root mean square error of approximation; SRMR = Standardized root mean square residual; CFI = Comparative fit index; NFI = Normed fit index.
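The invariance decision rule from the Data Analytic Strategy can be sketched as a small check applied to each comparison step. The function name is hypothetical, the .05 significance level is our assumption (the text specifies only the .01 cutoff for CFI differences), and the reported χ² and CFI values are read here as differences between nested models.

```python
def supports_invariance(p_value, delta_cfi, alpha=0.05, cfi_cutoff=0.01):
    """True when the chi-square difference is nonsignificant and the
    change in CFI stays below the cutoff."""
    return p_value > alpha and abs(delta_cfi) < cfi_cutoff


# p-values and CFI differences as reported in the text, in the order
# unconstrained, measurement weights, structural covariances.
gender_steps = [(0.16, 0.00), (0.12, 0.00), (0.08, 0.00)]
epmhs_steps = [(0.22, 0.00), (0.31, 0.00), (0.06, 0.00)]

gender_invariant = all(supports_invariance(p, d) for p, d in gender_steps)
epmhs_invariant = all(supports_invariance(p, d) for p, d in epmhs_steps)
```

Requiring both conditions at every step is a conservative reading of the rule; some authors rely on the CFI difference alone because χ² difference tests become very sensitive in large samples.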

Convergent Validity
To investigate the convergent validity of the scale translated into Serbian, the general AAQ-II score was correlated with indicators of subjective well-being, as well as with other potentially related constructs, namely styles of emotional regulation and coping strategies. As seen in Table 7, the AAQ-II score correlated highly and positively with indicators of depression, anxiety, stress, and general distress. This finding is consistent with findings reported in previous cross-cultural validation studies using the same instruments (Chang et al., 2017; Kleszcz et al., 2018; Ruiz et al., 2016; Szabó et al., 2011; Zhang et al., 2014). The lowest correlation coefficients were demonstrated in the Romanian sample (.47, .35, and .31, respectively; Szabó et al., 2011) and the highest in the Colombian sample (.73, .65, and .86, respectively; Ruiz et al., 2016). As for subjective well-being, the AAQ-II score demonstrated moderate negative correlations with indicators of life satisfaction and positive affect. This result is consistent with the seven culturally specific studies mentioned above, all demonstrating significant negative correlations ranging from -.21 for the Taiwanese sample (Chang et al., 2017) to -.64 for the Polish sample (Kleszcz et al., 2018). The experiential avoidance score also demonstrated a moderate negative correlation with positive affect and a high positive correlation with negative affect, which was demonstrated to an almost identical degree in the Taiwanese sample (-.37 and .66, respectively; Chang et al., 2017), and is also consistent with the Chinese sample (-.15 and .54, respectively; Zhang et al., 2014).
Note. M = Mean Score; SD = Standard Deviation; α = Cronbach's Alpha; r = Pearson correlation; CI = 95% confidence interval bootstrapped on 1000 generated samples; *p < .05; ** p < .01.
As expected, experiential avoidance was negatively correlated with the emotional regulation strategies of tolerating and adjusting to the presence of unpleasant emotions. The AAQ-II score demonstrated a minor negative correlation with problem solving and a significant positive correlation with the coping style of behavioral avoidance.
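The bootstrapped confidence intervals mentioned in the table note can be illustrated with a percentile bootstrap for a Pearson correlation. This is a minimal sketch under our own assumptions: respondents are resampled with replacement as pairs, 1000 resamples are drawn as in the text, and the interval is taken from the empirical percentiles. The source does not say which bootstrap variant was used, and the function names are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; raises ValueError if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for Pearson's r."""
    pearson_r(x, y)  # fail early if the full sample has no variance
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(pearson_r([x[i] for i in idx],
                                       [y[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples with no variance
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int(round((1 - tail) * (n_boot - 1)))]
    return lower, upper
```

Fixing the seed makes the interval reproducible across runs; a different seed will shift the bounds slightly, which is expected for a resampling method.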

Discussion
The aim of this paper was to examine the theoretical model of psychological flexibility, conceptualized as the central mechanism of change in Acceptance and Commitment Therapy, within the cultural context of the Western Balkans. To do so, we investigated the psychometric properties of the Acceptance and Action Questionnaire – II (Bond et al., 2011) translated into the Serbian language, using classical measurement theory. The study focused on evaluating the hypothesized structure of the AAQ-II and its data-driven adaptation, as well as the invariance of the adapted model across gender and previous experience with professional mental health services, on a large sample of Serbian-speaking adults.
The descriptive statistics for total score were found to be consistent with previous findings across six European cultures (Monestès et al., 2018). The scale translated to Serbian was found to be reliable and internally consistent, thus providing an initial indication that it can be used as a standard measure of experiential avoidance.
The latent structure of the scale was tested with confirmatory factor analysis. Because the indicator of multivariate kurtosis exceeded the threshold, violating the multivariate normality assumption on which standard maximum likelihood estimation rests, the robust estimation method was employed (Satorra & Bentler, 1994). The unidimensional structure of the scale was supported, in line with the findings of other authors evaluating the structure of the AAQ-II translated into other languages (e.g., Monestès et al., 2018). However, the best fit was achieved with the subsequent model constructed by adding three pairs of correlated residuals to the original model, as suggested by the multivariate Lagrange multiplier test. The correlated residuals of items 2 and 3 were previously reported in German (Gloster et al., 2011), Greek (Karekla & Michaelides, 2017), Hungarian (Szabó et al., 2011), and Turkish (Yavuz et al., 2016) samples, and explained as a consequence of similarity in item wording. In contrast to previous findings, the correlated residuals of item 7 with items 1 and 6 are unique to this study. However, the available studies did not report any measure of multivariate kurtosis that would indicate whether robust CFA was needed. This implies that conclusions in previous research may have been derived from CFA methods that assume multivariate normality, which might not have been adequate. Although further research is recommended to investigate this specificity of our model in comparison with other language-specific samples using the same method of robust CFA, we find these differences too small to suggest cross-cultural differences in psychological flexibility as a construct. Nevertheless, because this adaptation accounts for an additional source of variance, future research using this translation should evaluate the adapted model in new samples.
Apart from these minor variations in residual correlations, our findings do not suggest any major difference compared to other evaluations of the AAQ-II translated to specific languages in the European cultural context. In addition, the best-fitting model was found to be invariant across variables of gender and subsamples divided by the presence of previous experience with professional mental health services.
The significant correlations of adequate magnitude with measures of depression, anxiety, stress, general distress, satisfaction with life, as well as with positive and negative affect, are consistent with numerous previous findings in English-speaking samples (e.g., Fledderus et al., 2012) and nation-specific samples (e.g., Lundgren & Parling, 2016; Zhang et al., 2014). These findings suggest the excellent convergent validity of the scale and also indicate the construct of behavioral avoidance as a diminishing factor of mental health (e.g., Kashdan, Barrios, Forsyth, & Steger, 2006; Kashdan & Breen, 2007).
The convergent validity of the AAQ-II was additionally supported by negative correlations with the tolerating and adjusting styles of emotional regulation and the problem-solving coping strategy, and especially by the positive correlation with the avoidant coping style. This finding was expected, bearing in mind that these constructs are conceptually close to psychological inflexibility, defined as unwillingness to experience, and inability to act in the presence of, a full scope of inner experiences (Hayes et al., 2006). The negative correlation with the coping strategy of problem solving can also be expected, as psychological rigidity can be viewed as a factor that impairs shifting between different strategies and persisting in problem solving (Kashdan et al., 2006; Kashdan & Rottenberg, 2010; Krafft, Hicks, Mack, & Levin, 2018). On the other hand, as the AAQ-II is aimed at assessing the inner processes of a person dealing with private emotional experiences, it appears to be unrelated to communicating these experiences in a social context, as operationalized by the coping strategy of seeking social support, defined as actively turning to others for comfort, help, and advice (Amirkhan, 1990), as well as by the concealing style of emotional regulation, defined as a social skill of hiding emotional experiences and withholding expression in the presence of others (Hofmann & Kashdan, 2010).
The fact that model invariance was not tested across more specific clinical samples represents the most significant shortcoming of this study. Bearing in mind the potential use of the AAQ-II translation in clinical settings by researchers and practitioners aiming to investigate the role of psychological flexibility in various clinical states, the structural invariance of the model and the discriminant validity of the scale should be tested across much more specific samples.
Furthermore, the sample was collected via convenience sampling and does not accurately reflect the population structure in Serbia and Bosnia and Herzegovina (e.g., gender and educational distribution). As these circumstances have the potential to reduce the generalizability of our findings, they should be taken into consideration when drawing conclusions, as well as when planning future research. In addition, due to the cross-sectional design, it was not possible to test the temporal stability of the model or test-retest reliability. This could be especially important for future process-outcome studies based on the presumption that psychological flexibility represents the central mechanism and mediator of change during the course of psychological treatment (e.g., Hayes et al., 2012). Finally, because the AAQ-II is currently the only instrument originating from the theoretical scope of Acceptance and Commitment Therapy that has been translated into the Serbian language, our study could not address the concurrent validity of the AAQ-II.
We can conclude that, in the cultural context of the Western Balkans, the construct of psychological inflexibility was found to be conceptually similar to those found in other languages and cultural contexts. Notwithstanding the limitations mentioned above, the Acceptance and Action Questionnaire – II translated into Serbian can be recommended for future research involving the construct of psychological flexibility, bearing in mind that our sample responded to the instrument in a valid and reliable manner.