AR, CR and ARCR modeling for simulations and analyses of karst groundwater quality parameters

Although an invisible component of the hydrologic cycle, groundwater generally takes precedence over other water resources in the area of drinking water supply. Among groundwater resources, karst aquifers tend to be rich in sufficiently-accessible amounts of high-quality water. During most of the year, this water requires only disinfection prior to delivery to the end user. However, in many cases extreme rainfall and/or sudden snow melt results in transient turbidity, increase in bacterial count and temporary contamination (e.g. increase in nitrate and phosphate concentrations). To be able to determine the effect of the precipitation regime on various groundwater quality parameters, it is necessary to establish continuous monitoring of the parameter of interest and certain parameters should be observed at least once a day, if not more often (continuously). Such monitoring provides sufficiently long time-series of the considered parameter, so that autocorrelation and cross-correlation analyses can be undertaken and AR, CR and ARCR modeling used for simulations and short-term forecasts. Apart from the theoretical background, the paper presents a case study of the occurrence of nitrates at a karst spring called “Banja” near the city of Valjevo, Serbia. A ten-year (1991–2000) timeseries of the discharged volume of water was used in the study, as well as nitrate concentrations recorded on a daily basis. In addition, daily precipitation was gauged in the immediate vicinity of the catchment and the rainwater chemically analyzed. The analyses included nitrate concentrations in precipitation. The generated timeseries were used for autocorrelation and cross-correlation analyses of nitrate concentrations in the Banja Spring pool during the entire period of monitoring, as well as in one wet and one dry year. The results are presented for all three cases, based on simulations applying AR, CR and ARCR modeling.


Introduction
Groundwater, the invisible component of the hydrologic cycle, currently plays a fundamental role in maintaining public health and supporting industry and agriculture, as well as entire ecosystems (PETTITA et al., 2015). The importance of groundwater resources is corroborated by the fact that in most of Europe (more than 50% of European countries), groundwater accounts for 70% or more of the public water supply (HARTAI, 2016 * ). From a groundwater quality perspective, water resources formed in karst are specific and generally feature high quality that does not require a lot of spending to ensure compliance with drinking water standards (STEVANOVIĆ et al., 2011). The reasons for this high quality certainly lie in the fact that karst massifs are generally inaccessible, unpopulated and at some distance from large urban agglomerations.
A karst aquifer is a highly dynamic phenomenon in spatial and temporal terms. The regime of a karst aquifer is driven by a series of factors: the geologic framework; structural, tectonic, geomorphological and hydrogeologic characteristics of the setting; and climate. The most important features are (RISTIĆ, 1995): -the thickness of the carbonate rock complex, -the spread of the carbonate rocks, -the tectonic activity that led to the development of regional fault structures, macro and micro fracturing, dipping of strata, etc., -the karstification process and the creation of above-ground and underground morphological features, -groundwater recharge, flow and discharge, -the precipitation and temperature regimes in the area, -the distribution and nature of vegetation, and the like.
All of the above govern the rate of karst aquifer recharge, groundwater flow, discharge regime, variations in physical and chemical characteristics of groundwater, etc. Based on the duration and intensity of changes caused by climate factors, we distinguish perennial, annual, seasonal, daily and hourly regimes, or variations in the quantity and/or quality parameters in these time intervals.
From a drinking water supply perspective, daily fluctuations of a certain parameter are definitely the most important aspect, although seasonal and annual fluctuations are not far behind. On the one hand, daily variations in qualitative and quantitative parameters of karst groundwater can be slight. On the other hand, meteorological factors (early and rapid snow melt or protracted rain) can cause a sudden increase in, primarily, karst spring discharge, but also quality parameters (e.g. turbidity and total bacteria). This shows that the correlation between precipitation and discharge, or between precipitation and a quality parameter of the karst spring, can be diverse (RISTIĆ, 2007;RISTIĆ et al., 2012a). As a result of catchment response, spring discharge hydrographs can be very steep and of short duration or, conversely, their receding limb can exhibit a very mild slope . Between these two extremes, there is a large number of sub-variants and combinations. The same applies to plots of karst groundwater quality parameters as a function of time.
Apart from monitoring quantitative parameters, which are among the major components of the karst aquifer regime, it is also necessary, if possible, to establish monitoring of physical properties (temperature, color, odor, taste, etc.); chemical, bacterial and gas compositions; radioactivity; and the like. Sampling of karst groundwater is not an easy task because it is not readily accessible as it tends to occur in areas difficult to reach. Also, the groundwater samples required for analyses need to be transported to an accredited laboratory. In addition, the entire undertaking, from the field trip to the lab work, is rather costly. As a result, karst groundwater tends to be sampled once a month or even once a season. However, sudden changes in certain quality parameters require daily or even more frequent observation (every 12 hours, 6 hours, or continuously). Monitoring of a certain quality parameter allows for the generation of sufficiently long timeseries, such that autocorrelation and cross-correlation analyses can be used for assessment purposes, as well as AR, CR and ARCR models for simulations. The presented models are developed by N. KREŠIĆ and the methodology used in this paper is presented in details in several articles (KREŠIĆ, 1994(KREŠIĆ, , 1997(KREŠIĆ, , 2007.
Кључне речи: аутокорелација, кроскорелација, ауторегресиони модел, кросрегресиони модел, аутокросрегерсиони модел, нитрати, врело Бање, Србија. value tends to be affected by the value recorded yesterday, two days ago, three days ago, etc. Autocorrelation analysis is used to define this type of correlation and effect -to analyze the influence of a variable on itself. Autocorrelation is the effect of a random variable X (in the specific case karst spring discharge, spring pool water level, groundwater level, turbidity, nitrate concentration, total bacteria, etc.) on itself at a time step of 1, 2, 3, 4, ... n. The most often used time step is one day, but it can also be expressed in weeks, months, etc. The strength of correlation between time-series is expressed via the correlation coefficient r, derived from the equation (KREŠIĆ & STEVANOVIĆ, 2010): where: n is the number of data points, x aν is the average value of the sample, and x i is the value of the random variable at time t=i+k. The resulting autocorrelation coefficients for different time steps, as a function of the time steps, constitute an autocorrelogram.

Cross-correlation analysis
With regard to the effect of precipitation, it should be noted that a raindrop that reaches karst and carries certain chemical information received in the atmosphere, requires some time to reach the aquifer and then, as part of the aquifer, to travel down privileged pathways to the spring (RISTIĆ VAKANJAC, 2015). Along the way, it interacts with the ground surface and the geologic setting, and changes its own chemical composition. It also entrains certain deposited components that lay dormant in karst conduits and caverns during a dry period. As such, the quality of karst groundwater varies and, if we exclude the anthropogenic impact and excessive accidental pollution, largely depends on the precipitation regime.
Based on the above, it is safe to say that spring flow (or any other component of the karst spring regime) is more or less influenced by precipitation (or any other water input), and this influence can be delayed for a variety of reasons (KREŠIĆ & STEVANOVIĆ, 2010). Cross-correlation analysis is used to assess the effect of mutually dependent variables at different time steps. The cross-correlation coefficient for any time step can be obtained from (PROHASKA, 2006;KREŠIĆ & STEVANOVIĆ, 2010): where: COV is the covariance between two time-series, x i is the independent variable (in the specific case the amount of daily precipitation), y i is the dependent variable -time-series of the daily average values of the analyzed quality or quantity parameter, VAR(x i ) and are the variances of the two time-series. The covariance is calculated applying the equation and the variances of the time series from The cross-correlation coefficients as a function of different time steps constitute a cross-correlogram.

Autoregressive (AR), cross-regressive (CR) and autoregressive-cross-regressive models (ARCR)
Autoregressive (AR) models can be used to simulate (estimate) karst spring discharges or concentrations of a quality parameter, where X t is the predicted parameter at time t and the independent variables X t-1 , X t-2 , … X t-k stand for the analyzed parameter 1, 2, ... k days earlier. The equation is as follows (KREŠIĆ & STEVANOVIĆ, 2010;: Cross-regressive (CR) modeling can be used for the same purpose, based on the same principle, where the dependent variable X t also represents the concentration of a quality parameter at time t. Independent variables Y t-1 , Y t-2 , ... Y t-n denote precipitation 1, 2, ... days earlier or the concentration of the considered parameter in precipitation. The equation is (KREŠIĆ & STEVANOVIĆ, 2010;: where a, b 1 , b 2 ... b n are also model parameters.
And, finally, the above two models combined are the most frequently used ARCR model for simulations and short-term forecasts (AR + CR = ARCR), whose general equation is (KREŠIĆ & STEVANOVIĆ, 2010;: .. c n are also model parameters.

Study area -karst spring "Banja" near Valjevo
Population growth and industrial and agricultural development are having an increasing impact on the entire ecosystem. The quality of air, water and soil is changing in the negative direction and the biocenosis is under increasing stress. Nitrogen compounds, which are generally a result of the chemical industry, application of fertilizers, exhaust gases, and the like, have been classified as polluters extremely harmful to human health (RISTIĆ VAKANJAC et al., 2012). On the other hand, frequent bacterial contamination of water, due to natural processes but mostly as a result of human activity (e.g. illegal dumps), can also have a major impact on public health.
With this in mind, the catchment of the Banja Spring (source of the Banja River) was selected in 1990 as a test site to monitor qualitative parameters and assess the effect of precipitation on them. There were multiple reasons, but the most important were: 1. the immediate vicinity of the Petnica Research Station (PRS), where the necessary analyses were performed, as well as easy access to the spring (PRS is located about 500 m from the Banja Spring - Fig. 1); 2. the proximity to an urban agglomeration -the city of Valjevo (7 km as the crow flies), which is an industrial center and has a population of 60,000 within the city fabric according to the 2014 census, and more than 90,000 including the extended area; and 3. crop farming, rural households and a relatively small livestock farm within the spring catchment (Figs. 2 and 3).
In January 1991, a dam and staff gauge were installed at the Banja Spring for water level tracking, as well as a rain gauge at PRS for precipitation monitoring in the catchment. Sampling for hydrochemical analyses of the spring water was conducted daily. When there was precipitation, hydrochemical analyses also included meteoric water. As a result, the spring discharge and quality parameters (flow rate, temperature, pH, FTU, electrical conductivity, suspended sediment, Ca, Mg, Na, K, Cl, HCO 3 , SO 4 , NO 3 , and total bacteria) were monitored from 1991 to 2000. In parallel, precipitation chemism was also observed, including (apart from precipitation depth) pH, electrical conductivity, Ca, Mg, HCO 3 , SO 4 , NO 3 , and PO 4 .
The Banja Spring catchment is located in western Serbia, 7 km from the city of Valjevo (to the east-southeast) and in the immediate vicinity of the Petnica Research Station (Fig. 4). The elevations of the catchment are from 181 m (point of spring discharge) to Geol. an. Balk. poluos., 2018, 79 (1), 71-81.   slightly more than 600 m above sea level. Most of the catchment is comprised of massive Middle Triassic limestones, partly covered by Miocene sediments composed of marls and red bituminous schists (Fig. 4).
The above-ground morphological features of the Triassic limestones include numerous sinkholes (12 per km 2 on average), which are mostly found in series. Among underground features, there are cavers and fractures; the Petnica Cave is the most important feature and the place where the Banja Spring emerges (Fig. 1).
There are two surface streams in the Banja Spring catchment: the Zlatar and the Bukovik. The Zlatar is formed in Miocene sediments and sinks in a ponor, at Pećurine, when it enters a karst substrate. The Bukovik rises in Triassic limestones, in an area where karstification is low (no sinkholes), Fig. 1b. Its lower course is in a highly-karstified part of the limestones, where in follows the linear occurrence of sinkholes, slowly depletes and ultimately dries out. PRS has con-ducted tracing tests on several occasions and confirmed that there was a connection between the two streams and the Banja Spring (GOLUBOVIĆ et al., 2014;.

Assessment of nitrates
Animal wastes from livestock farms, fertilizers, pesticides, and herbicides, as well as the lack of sewerage system for wastewater, significantly influences the increased amount of rural pollutants in groundwater, including nitrates which belong to this group of pollutants. Waters of improper chemical constitution, used for irrigation of agricultural soil (deposits) covering karst aquifers, could also be placed among rural pollutants when leakage is present (KREŠIĆ et al. 1992).
Nitrates were certainly one of the most interesting parameters of the Banja Spring water quality, which were monitored in both karst groundwater and precipitation (Figs. 5 and 6). Before the monitoring at the Banja spring was initiated in 1991, sampling had been sporadically carried out during the period from 1976 to 1988 in order to determine concentration of nitrates in karst waters. According to the available information, the maximum concentration of nitrates in the karst spring Banja varied within the interval from 6 mg/dm 3 to 27 mg/dm 3 during this period (KREŠIĆ et al., 1989;PAPIĆ et al., 1991). Therefore, the maximum concentration of nitrates in the karst spring Banja increased about 4.5 times during the twelve years period.
Over the 10-year monitoring period (1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000), the perennial average nitrate concentration was 20.1 mg/dm 3 and the annual averages were in the interval from 15.1 mg/dm 3 (1994) to 27 mg/dm 3 (1998). On a daily basis, the lowest concentration was recorded on October 1 st 1994 (3.4 mg/dm 3 ) and the highest (46.9 mg/dm 3 ) was on July 9 th 1999 (RISTIĆ VAKANJAC et al., 2013). The perennial average nitrate concentration in precipitation was 7.7 mg/dm 3 and on an annual basis from 5.06 mg/dm 3 (1993) to 12.8 mg/dm 3 (1991). The highest nitrate concentration in precipitation, of 65.6 mg/dm 3 , was recorded on June 1 st 1998 and the lowest (0.05 mg/dm 3 ) was on May 5 th of the same year (1998) (RISTIĆ VAKANJAC et al., 2013). It should be noted that the highest nitrate concentrations in precipitation are associated with the first day of a rainfall episode and that they increase with the length of time between two episodes. For example, a sample collected on the first day of a rainfall episode (when the interval between two episodes was one day or more (t ≥ 1)) and a sample taken after there was no precipitation for a minimum of five days (t ≥ 5) revealed that the nitrate concentration was about 18% higher in the case of t ≥ 5, compared to t ≥ 1 (RISTIĆ VAKANJAC et al., 2013). The situation was similar with standard deviation, which was about 17% higher in the specific case.
To simulate nitrate concentrations in the karst spring pool, first autocorrelation and cross-correlation analyses of the concentrations were undertaken, for the entire period (10 years) and for each year, to assess the occurrence of nitrates at the karst spring during dry and wet years.
The simulations resulted in an autocorrelogram of the nitrate concentrations of the Banja Spring for the entire period, as shown in Fig. 7. The autocorrelograms for the selected wet and dry year are shown in Fig. 8. It follows from these figures that the autocorrelation coefficient declines with increasing time lag. With regard to the entire period, the sample was extremely autocorrelated. It is apparent (in Fig. 7) that the autocorrelation coefficient was greater than 0.3 even after 100 days. The situation was similar in the selected cases (dry and wet year). In the dry year, the autocorrelogram declined slowly and the autocorrelation coefficient dropped to below 0.2 only after 40 days (i.e. the memory of the system was up to 40 days). In the wet year, the autocorrelogram was steeper and the memory up to 12 days. More precisely, after 12 days the nitrate ion concentrations in the Banja Spring pool become independent variables (there was no persistence and the time series had no memory) (KREŠIĆ & STEVANOVIĆ, 2010).
With regard to cross-correlation, the effect of precipitation totals on nitrate concentrations in the spring pool were analyzed for the entire period (Fig. 9) and separately for the wet and the dry year (Fig. 10). In addition, the effect of nitrate concentrations in precipitation on the nitrate concentrations in the spring pool were analyzed following the same principle -the cross-correlogram for the entire period is shown in Fig. 11 and for the wet and the dry year in Fig. 12.
The correlation coefficient in both cross-correlation analyses of the entire study period was extremely low (less than 0.1), which indicated that neither precipitation nor nitrate concentrations in precipitation had a significant effect on the occurrence of nitrates at the         spring. The correlation was somewhat stronger when each year as analyzed separately, especially in the case of wet years, although the correlation coefficients still remained below the threshold value of 0.2. The reason for such cross-correlograms is certainly nitrate pollution within the Banja Spring catchment, as a result of application of fertilizers, the livestock farm, and rural households with no access to public sanitation.

Simulation of nitrate concentration in the Banja Spring pool
All three models: AR, CR and combined ARCR, were used to simulate nitrate concentrations in the water of the Banja Spring. The resulting values showed that the correlation coefficients obtained with the AR model for nitrates in the karst spring pool were much higher than those of the CR model (Figs. 13, 14  and 15). The AR model coefficients based on data for the entire study period were from 0.866 (order 1) to 0.882 (order 15) and were the result of a long memory (Fig. 7). Conversely, the CR model coefficients were extremely low: -from 0.004 (order 1) to 0.036 (order 15) -simulation of nitrate concentrations in the spring pool as a function of daily precipitation totals over the entire study period, and -from 0.065 (order 1) to 0.183 (order 15) -simulation of nitrate concentrations in the spring pool as a function of nitrate concentrations in precipitation over the entire study period.
The application of these models on an annual basis and to one wet and one dry year provided slightly better results. The AR model correlation coefficients were still high but the simulations were better for dry years (higher correlation coefficients). Relative to the entire study period, the CR model yielded much high-er correlation coefficients for both the dry and the wet year, but it should be noted that in this case the simulations were better for the wet years (higher correlation coefficients, Figs 14 and 15).
In general, the combined ARCR model should have produced the best results. However, because the simulation results of the AR model were good and those of the CR model much poorer (for the entire study period), the ARCR model correlation coefficients were roughly equal to those of the AR model.    -from 0.734 (model order 1) to 0.756 (model order 7) for the wet year. Figure 16 is a parallel representation of measured and calculated nitrate concentrations for model order 1.

Conclusion
Assessments of the regimes of quantitative and qualitative parameters of karst groundwater provide better insight into the karst system itself, types of hydrogeological structures, method of recharge, dependence on precipitation, regime of the analyzed parameter, etc. For any kind of regime assessment aimed at defining an appropriate simulation and/or short-term forecasting model, the first requirement is a sufficiently long time-series. Such an assessment was enabled by ten-year monitoring of the qualitative and quantitative parameters of karst groundwater that emerges at the Banja Spring near Petnica (Valjevo), and of precipitation by means of a rain-gauge station installed at the Petnica Research Station. The assessment was conducted for the entire period of monitoring (1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000), as well as for one dry and one rainy year. The correlation coefficients in the autocorrelation analysis of nitrate concentrations in the spring pool were found to decrease with increasing time lag in all three cases. The memory of the system was very long -more than 100 days -for the entire period. The autocorrelogram of the dry year exhibited a slower decline than that of the wet year. The memory in the case of the dry year was about three times longer than that of the wet year. The strong autocorrelation of the time-series indicated that there was a strong inter-dependency of the random variable (in the specific case nitrate ion concentration), resulting in a weak dependency in cross-correlation analyses (precipitation -nitrate concentration in the spring pool or nitrate concentration in precipitation -nitrate concentration in the spring pool). The undertaken cross-correlation analyses corroborated this. The correlation coefficients in the cross-correlation analyses were extremely low (less than 0.1), meaning that neither precipitation nor nitrate concentrations in precipitation had a significant effect on nitrate concentrations in the spring pool. When each year was studied separately, the resulting correlation was somewhat stronger, especially in the wet year, although the correlation coefficients remained below the threshold value of 0.2. The reason for the low cross-correlation coefficients was definitely nitrate pollution in the Banja Spring catchment as a result of application of crop fertilizers and the presence of a livestock farm and rural households with no access to public sanitation.
In view of all the above and based on the results of the auto-and cross-correlation analyses, AR models were expected to produce much better results than CR models. The following models were used to simulate nitrate concentrations in the Banja Spring pool: autoregressive -AR, cross-regressive -CR, and combined ARCR models. The correlation coefficients derived from the AR and CR models showed that in the case of nitrates, the AR model yielded much higher correlation coefficients than the CR model (Figs. 13, 14  and 15). The AR model coefficients based on the entire study period were from 0.866 (with one 1 st order random variable) to 0.882 (equation of the 15 th order -15 random variables), and were a result of a long memory (Fig. 7). Conversely, the CR model coefficients were extremely low: -from 0.004 (1 st order eq.) to 0.036 (15 th order eq.) -simulation of nitrate concentrations in the spring pool as a function of daily precipitation over the entire study period, and -from 0.065 (1 st order eq.) to 0.183 (15 th order eq.) -simulation of nitrate concentrations in the spring pool as a function of nitrate concentrations in precipitation over the entire study period.
The situation was somewhat better when the models were applied on an annual basis and to one dry and one wet year. The correlation coefficients derived from the AR model were still high but the simulation was better (higher correlation coefficients) for dry years. Relative to the entire study period, the CR models yielded much higher correlation coefficients for both the dry and the wet year, although it should be kept in mind that the simulations were better (the correlation coefficients were higher) for wet years (Figs. 14 and 15).
The combined model -ARCR should generally have yielded the best results. However, because the AR model produced good results and the CR model much poorer results (for the entire study period), the correlation coefficients of the ARCR model were roughly equal to those of the AR model. For example, in the case of nitrate concentrations in the spring pool as a function of nitrate concentrations in the spring pool on previous days and in precipitation on previous days, the ARCR model correlation coefficients were: -from 0.866 (1 st order eq.) to 0.881 (7 th order eq.) for the entire study period, -from 0.762 (1 st order eq.) to 0.804 (7 th order eq.) for the dry year, and -from 0.734 (1 st order eq.) to 0.756 (7 th order eq.) for the wet year.