Dosimetric verification of clinical radiotherapy treatment planning system

Background/Aim. The aim of the study was to investigate whether there is a significant difference in: a) the dosimetric calculation of the radiotherapy treatment planning system (TPS) in relation to the values measured on the linear accelerator (Linac), b) the accuracy of the dosimetric calculation between the calculating algorithms Analytical Anisotropic Algorithm (AAA) and AcurosXB in various tissues and at various photon beam energies. Methods. For the End-to-End test we used the heterogeneous phantom CIRS Thorax002LFC, which anatomically represents the human torso and has a set of inserts of known relative electron density (RED) for obtaining a CT calibration curve, comparable to that of the "reference" CIRS 062M phantom. For the AAA and AcurosXB algorithms and for 6 MV and 16 MV photon beams, four 3D conformal (3DCRT), one intensity modulated (IMRT) and one volumetric modulated arc (VMAT) radiotherapy plans were made in the TPS Varian Eclipse 13.6. Measurements of the absolute dose in the Thorax phantom, with a PTW-Semiflex ionization chamber, were carried out on three Varian-DHX Linacs. Results. The difference between the "reference" and measured CT conversion curves in the bone area is 3 %. Of 476 phantom measurements, 30 (6.3 %) showed a difference of (3-6) % between the measured and TPS calculated dose. According to regression analysis, the standardized Beta coefficient for relative errors, 6 MV vs 16 MV, was 0.337 (33.7 %, p < 0.001). Mean relative errors for AAA vs AcurosXB in bone, using the Mann-Whitney test, were 1.56 % and 2.64 % (p = 0.004). Conclusion. The End-to-End test on the Thorax002LFC phantom confirmed the accuracy of the TPS dose calculation in relation to the dose delivered to the patient by the Linac. There is a significant difference in relative errors between photon energies (higher values are obtained for 16 MV than for 6 MV). For bone, a statistically significantly smaller relative error was found for AAA than for AcurosXB. test, heterogeneous phantom, calculating algorithms.


Introduction
Modern radiotherapy (RT) is among the technologically most complex branches of medicine today. In the treatment of malignant diseases, ionizing radiation is directed at the volume in which the tumor cells are located, in order to destroy them permanently while protecting the surrounding healthy tissue as much as possible.
In the past two decades, with the development of information technology, we have witnessed the emergence of new radiation therapy techniques, radiotherapy treatment planning systems (TPS) with calculating algorithms for dose calculation in the patient, and units for multislice computed tomography (CT) and image-guided treatment delivery, all of which enable better and more precise treatment for patients.
Based on the data set previously measured on the Linac and CT simulator, the TPS calculates the three-dimensional (3D) dose distribution in the patient. Unfortunately, many cases of incorrect data import and incorrect use of the TPS have been published, some of which led to accidents with lethal outcome 1,2.
Namely, 28 % of accidents in RT are due to wrong TPS dose calculations, caused by poor knowledge of the TPS, incorrect data entered into the TPS, and a lack of TPS calculation quality assurance (QA TPS) 3. International recommendations are that the dose delivered to the patient differs from the prescribed dose by no more than 5 %, and that the incidence of TPS calculation errors is less than (3)(4). Therefore, the implementation of a QA-TPS procedure (such as the "End-to-End" test) for the TPS in RT is crucial for reducing the number of accidents. Several studies have helped develop guidelines and protocols for Linac-based QA TPS for 3D Conformal Radiotherapy (3DCRT) [5][6][7][8] and Intensity Modulated Radiotherapy (IMRT) 9,10, depending on the calculation algorithm used in the TPS 11,12. Nowadays, in addition to the 3DCRT and IMRT techniques, volumetric modulated arc therapy (VMAT) is also used in routine practice.
It is therefore of great importance to prepare and implement an "End-to-End" test, which is used to control the overall precision of the entire RT chain. It is made up of a set of practical tests conducted on a heterogeneous phantom. In general, an "End-to-End" test consists of: a) recording a calibration curve on the CT simulator and comparing it with the reference curve (entered into the TPS), b) creating characteristic RT plans for all RT techniques, photon beam energies and calculating algorithms, c) irradiating the prepared plans on the Linac and measuring doses at defined positions in the phantom (types of tissue).
Based on the "End-to-End" test, we launched a dosimetric study to investigate: a) whether there is a significant difference between the dosimetric calculation of the TPS (for the 3DCRT, IMRT and VMAT radiation techniques) and the values measured in the phantom on the Linac, b) whether there is a significant difference in the accuracy of the dosimetric calculation between the calculation algorithms Analytical Anisotropic Algorithm (AAA) and AcurosXB, depending on the type of tissue in which the dose is delivered and on the photon beam energy.

Material and Methods
Under the same standardized methodological principles, this study investigated the influence of various RT factors (radiation technique, photon beam energy, calculation algorithm and tissue type) on the accuracy of the TPS calculated dose.
The dosimetric tests cover all techniques of external beam radiotherapy (EBRT), and the anatomical structures are similar to those encountered when working with patients.
All testing was carried out at the same facility, in a relatively short period of time and by the same professional team, which generally supports the repeatability and accuracy of the measurements.

Phantom
The heterogeneous phantom CIRS Thorax002LFC (Computerized Imaging Reference Systems Inc., Norfolk, Virginia), which anatomically represents the average human torso (30 cm long, 30 cm wide and 20 cm thick), was used in all segments of this study. It is made of plastic water, lungs (density 0.21 g/cm³) and bone-spinal cord (1.6 g/cm³), with 10 cylindrical inserts where the ionization chamber can be placed (Figure 1) and the dose measured at that position. The phantom also has a set of inserts (muscle, bone, lung and adipose equivalent tissue) of known relative electron densities (RED) 13.

Scanning the Phantom on a CT simulator
The Thorax002LFC phantom was scanned on a sixteen-slice LightSpeed CT simulator (GE, Boston, Massachusetts) with a wide-bore gantry of 80 cm diameter, at an X-ray tube voltage of 120 kV (thorax protocol). It was first scanned with the inserts of known electron density in order to obtain the CT calibration curve, i.e. the relationship between RED and Hounsfield units (HU). The materials used cover the range from -1000 HU for air, through 0 HU for water, to 1000 HU for materials that simulate bone. The obtained curve was compared with the "reference" curve in the TPS, which was created by scanning the CIRS 062M phantom (25 cm long, 33 cm wide and 27 cm thick), which has 16 inserts of known RED, under the same conditions on the CT simulator. The acceptable difference in RED for the same HU value between the curves is ±0.02 (i.e. ±20 HU for the same RED value, except for water, ±5 HU) 4.
The Thorax002LFC phantom was then scanned a second time (thorax protocol) with the corresponding cylindrical tissue inserts (Figure 1), in order to create a set of RT plans in the TPS.
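The curve comparison described above can be illustrated with a minimal sketch. It assumes that the reference curve is stored as sorted (HU, RED) pairs and that RED between the tabulated points is obtained by linear interpolation; the function names and all numerical values are hypothetical and are not the data of this study. Only the ±0.02 RED tolerance is taken from the text.

```python
from bisect import bisect_right

def red_at(hu, curve):
    """Linearly interpolate RED at a given HU from sorted (HU, RED) points."""
    hus = [point[0] for point in curve]
    if not hus[0] <= hu <= hus[-1]:
        raise ValueError("HU outside the calibration range")
    # Index of the right-hand neighbour, clamped so the last point is usable.
    i = min(bisect_right(hus, hu), len(curve) - 1)
    (h0, r0), (h1, r1) = curve[i - 1], curve[i]
    return r0 + (r1 - r0) * (hu - h0) / (h1 - h0)

def check_curve(measured, reference, tol=0.02):
    """Return (HU, measured RED, reference RED, within tolerance) per insert."""
    rows = []
    for hu, red in measured:
        ref = red_at(hu, reference)
        rows.append((hu, red, ref, abs(red - ref) <= tol))
    return rows

# Hypothetical curves for illustration only (not the study's measurements).
reference = [(-1000, 0.00), (-800, 0.20), (0, 1.00), (1000, 1.60)]
measured = [(-800, 0.21), (0, 1.00), (800, 1.43)]

for hu, red, ref, ok in check_curve(measured, reference):
    print(f"HU {hu:6d}: measured RED {red:.2f}, reference RED {ref:.2f}, "
          f"{'within' if ok else 'outside'} tolerance")
```

In this made-up example the low-density and water points pass, while the bone-like point falls outside the tolerance, which is the kind of pattern the comparison is designed to reveal.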

The creation of clinical RT plans for dosimetric measurements
For the purposes of the study, six RT plans were made in the EBRT planning system Varian Eclipse 13.6 (Varian Medical Systems, Palo Alto, California): four 3DCRT 5, one IMRT and one VMAT 10. All plans were made for two photon energies, 6 MV and 16 MV, and for two calculating algorithms, AAA and AcurosXB. In this way, the isodose distribution in the phantom was obtained, i.e. the absolute dose in different tissues (measuring points).
The beam geometry, the isodose distribution and the positions of the measuring points of the 3DCRT plans are shown in Figure 2, while the detailed parameters of the plans are given in Table 1. Table 2.

Measurements on Linacs
The measurements were carried out on three Varian DHX Linacs (
Data are presented as arithmetic mean value with standard deviation (SD) or confidence interval (CI). The Kolmogorov-Smirnov test was applied to assess the normality of the studied continuous data.
The strength of the association between the independent factors (accelerators, algorithms, tissues, photon energies, tests) and the relative error (dependent variable) was determined using univariate and multiple linear regression analysis. Further detailed assessment was carried out using GLM univariate ANOVA (post hoc Bonferroni test) and Mann-Whitney U tests.
Statistical significance was set at p < 0.05 for all analyses.
The complete statistical analysis was performed with the statistical software package SPSS Statistics 18 (USA).
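As an illustration of the normality check, the following minimal sketch computes the Kolmogorov-Smirnov distance between the empirical distribution of a sample and a normal distribution. It is an assumption-laden sketch, not the SPSS procedure: the mean and SD are estimated from the sample itself (strictly, the Lilliefors variant), only the test statistic D is computed (no p-value), and the function names and sample values are hypothetical.

```python
import math
from statistics import mean, stdev

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic_normal(values):
    """Largest distance D between the empirical CDF and a fitted normal CDF."""
    xs = sorted(values)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)  # assumption: parameters taken from the sample
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical samples for illustration only.
symmetric = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
skewed = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 10.0]

print(f"D (symmetric sample) = {ks_statistic_normal(symmetric):.3f}")
print(f"D (skewed sample)    = {ks_statistic_normal(skewed):.3f}")
```

A larger D indicates a stronger departure from normality; whether it is significant depends on the sample size and on the critical values used by the statistical package.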

CT to RED conversion
By measuring HU values for known RED values, we obtained the CT conversion curve for the CIRS Thorax002LFC phantom. The obtained curve was compared with the "reference" (TPS) curve. A difference is seen in the region of high electron densities (Figure 4), while in the lower density region the match is within the allowed values. The RED values for bone (829 HU) differ by 3 %, while the difference in HU (RED 1.51) is 10 % (Figure 4).

Results of clinical test cases
The differences between the measured and TPS calculated doses at different measuring points (tissues) and for different RT plans (cases 1-6), together with the tolerance values (agreement criteria), measured on three Linacs, are presented in Figures 5 and 6. The results are grouped by calculating algorithm and photon beam energy. As the Kolmogorov-Smirnov test revealed a non-normal distribution of the relative errors, data transformation was necessary.
First, negative values obtained at any point were removed by adding a corresponding fixed value to all data, so that all relative errors became positive. Second, these data were further transformed by applying the log10(X) transformation, and the transformed data were used in all the analyses presented.
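The two-step transformation can be sketched as follows. The fixed value added to the data is not stated in the text, so the sketch assumes an offset that maps the smallest relative error to 1 (and therefore to 0 after the logarithm); the function name and the input values are hypothetical.

```python
import math

def shift_and_log10(errors, offset=None):
    """Shift all values by a constant so they are positive, then apply log10."""
    if offset is None:
        # Assumption: choose the offset so that the smallest value becomes 1.
        offset = 1.0 - min(errors)
    shifted = [e + offset for e in errors]
    if min(shifted) <= 0:
        raise ValueError("offset too small: shifted values must be positive")
    return [math.log10(s) for s in shifted]

# Hypothetical relative errors in percent (not the study's data).
relative_errors = [-2.0, 0.0, 7.0]
print(shift_and_log10(relative_errors))
```

Because the same constant is added to every value, the ordering of the relative errors is preserved, which is what the rank-based Mann-Whitney U test relies on.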
Using univariate and multivariate regression analysis, the effect of the independent (explanatory) variables on the relative error X (%) was examined (Table 3). Using the univariate analysis of variance (GLM model, ANOVA), we examined the main effects of the independent predictors on the relative error X (%) (Table 4). Because of the large number of potential interactions between the independent variables (26 in total), their effects on the measured results are not shown. For the independent predictors that had a statistically significant effect on the relative errors (deviations), the significance of the differences between the mean relative errors of individual categories was investigated using the Bonferroni test (pairwise comparisons). An overview of this analysis is given in Table 5. In addition, we used the Mann-Whitney U test to investigate the magnitude of the mean relative error depending on the calculation algorithm and tissue type (Table 6).
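For readers unfamiliar with the Mann-Whitney U test, a minimal sketch of a two-sided test between two groups of relative errors is given here. It is a simplified illustration under stated assumptions: it uses the normal approximation with tie correction and no continuity correction, which may differ from the exact or corrected p-values reported by SPSS, and the function name and sample values are hypothetical.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2 = len(a), len(b)
    n = n1 + n2
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        t = j - i                             # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_a - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0                         # all values identical
    z = (u - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided p-value
    return u, p

# Hypothetical relative errors (%) for two algorithms, not the study's data.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, p = {p:.4f}")
```

The test compares ranks rather than raw values, so it does not require the normality that the Kolmogorov-Smirnov test rejected for the relative errors.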

Discussion
Based on the comparison of the "reference" and measured conversion curves, we established a difference in the region of higher electron densities (RED values for bone differ by 3 %), while in the lower density region the match is within the allowed values (Figure 4). However, it is estimated that a difference of 8 % in bone relative electron density affects the accuracy of the TPS dose calculation by less than 1 % 16.
Out of a total of 476 measuring points, the deviation between TPS calculated and measured doses of (3-6) % was obtained in 30 measuring points (6.3 %) (Figures 5, 6).
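The way such deviations can be computed and counted is sketched below. The sign convention and normalization (measured minus calculated, divided by the calculated dose) are assumptions made for illustration, since other normalizations are also in use; the function names and dose values are hypothetical.

```python
def relative_deviation(measured, calculated):
    """Percent deviation of the measured dose from the TPS calculated dose.

    Assumption: positive values mean the measured dose is higher.
    """
    return 100.0 * (measured - calculated) / calculated

def in_band(deviation, low=3.0, high=6.0):
    """True if the absolute deviation lies in the (3-6) % band."""
    return low <= abs(deviation) <= high

# Hypothetical (measured, calculated) doses in Gy, not the study's data.
points = [(2.04, 2.00), (1.93, 2.00), (2.01, 2.00), (0.52, 0.50)]
deviations = [relative_deviation(m, c) for m, c in points]
flagged = [d for d in deviations if in_band(d)]

print([round(d, 1) for d in deviations])
print(f"{len(flagged)} of {len(points)} points in the (3-6) % band")
```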
For AcurosXB, the measured dose is higher than the TPS calculated dose in 188 cases (79 %), while for AAA this is the case in 165 cases (69.3 %).
By tissue type, the measured dose is higher than the calculated dose in 88.6 % of cases for bone, 76.3 % for lung and 70.7 % for soft tissue.
When bone tissue is analyzed separately, the measured dose is higher than the calculated dose in 95.5 % of points for AcurosXB (81.8 % for the AAA algorithm).
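The reported percentages can be cross-checked with simple arithmetic. The sketch below assumes that the 476 measuring points are split equally between the two algorithms (238 each); this split is not stated explicitly in the text, but it is consistent with the reported counts and percentages.

```python
def share(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

TOTAL_POINTS = 476
PER_ALGORITHM = TOTAL_POINTS // 2  # assumption: equal split between AAA and AcurosXB

print(share(30, TOTAL_POINTS))     # points with a (3-6) % deviation
print(share(188, PER_ALGORITHM))   # AcurosXB: measured dose higher than calculated
print(share(165, PER_ALGORITHM))   # AAA: measured dose higher than calculated
```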
Based on the univariate and multivariate regression analysis, we can see a significant influence of the calculating algorithm, tissue type, photon beam energy and test type (cases 1-6) on the relative error (deviation) in both models (Table 3). These data indicate that these variables are significant independent predictors of the size of the relative error. The Linac used has no significant effect on the size of the relative error. Based on the value of the standardized Beta coefficient (  15 and Knoos et al. 18 ), in bone tissue compared to soft tissue and lungs, in test cases 1 and 2 (compared to other cases) and in the application of the calculation algorithm AcurosXB vs. AAA.
Using the univariate analysis of variance (GLM model, ANOVA), this study confirmed the significant effects on the relative error (previously obtained by univariate and multivariate regression analysis) of the applied calculation algorithm, type of tissue, photon beam energy and type of test (Table 4).
If we focus on the specific research objectives of this study, the supplementary (post hoc) analysis (Bonferroni test, AcurosXB has a smaller relative error).
The design of the study also introduced certain weaknesses, primarily in the statistical part of the examination. When multiple independent variables are examined simultaneously (multiple regression analysis, GLM univariate ANOVA with multiple independent variables), the highest reliability is ideally obtained when the number of samples in each group is approximately the same. The phantom characteristics (the unequal number of measuring points per tissue type) contributed significantly to this problem.
The robustness and reliability of the selected statistical methods, together with the fact that different statistical techniques confirm the results of the test, largely support the correctness of our conclusions.

Conclusion
The End-to-End test performed on the heterogeneous phantom CIRS Thorax002LFC confirms the correct TPS dose calculation (for all EBRT techniques, photon beam energies, calculating algorithms and tissue types) and its delivery to the patient on the Linac in the daily clinical practice of our RT center. In practice, this phantom can be used for control, but not for obtaining a reference calibration curve. The analysis of the results showed no statistically significant difference between the Linacs, but a significant difference between photon energies (greater relative errors can be expected with 16 MV than with 6 MV). Regarding the calculation algorithms (AAA vs AcurosXB), there were no significant differences in relative errors for soft tissue and lung, but for bone there is a difference in favor of AAA.