Clinical applicability of setup margins in radiotherapy based on non-gaussian approach: A multiple-site comparison
*Corresponding author: Dr. Ganeshkumar Patel, Department of Radiotherapy and Radiation Medicine, Institute of Medical Sciences, Banaras Hindu University, Varanasi, Uttar Pradesh, India. gpatel.radio@bhu.ac.in
How to cite this article: Mondal K, Muskan, Mandal A, Vijay A, Yadav S, Dutta S, et al. Clinical applicability of setup margins in radiotherapy based on non-gaussian approach: A multiple-site comparison. Asian J Oncol. 2025;11:25. doi: 10.25259/ASJO_21_2025
Abstract
Objectives:
Accurate setup margin (SM) estimation is critical in radiotherapy to ensure target coverage while sparing organs at risk. Traditional methods such as van Herk’s formalism assume that setup errors follow a normal (Gaussian) distribution, which may not hold for small or skewed datasets and may result in under- or over-estimated SMs. This study evaluates an unconventional method based on the non-Gaussian percentile approach against the van Herk formalism for SM estimation across multiple tumor sites using electronic portal imaging device (EPID) setup data.
Material and Methods:
Eighty patients (20 per site: brain, head and neck, thorax, pelvis) treated with external beam radiotherapy were analyzed retrospectively. Setup errors were measured via EPID-based imaging. Conventional SMs were calculated using van Herk’s method (2.5Σ + 0.7σ), while the unconventional method derived SMs directly from the setup-error histogram, i.e., the 5th–95th percentiles (90% range of setup errors, RSE) of per-patient mean daily shifts. Normality was assessed using Shapiro-Wilk (S-W) tests, skewness/kurtosis analysis, and Q-Q plots. Methods were compared using the Wilcoxon signed-rank test (p<0.05), and effect sizes (Cohen’s d) were reported without claiming clinical significance.
Results:
Unconventional SMs were generally larger than conventional SMs (overall medians: 4.21 vs 2.71 mm; p<0.05), with the largest differences in thorax (Δ up to 3.63 mm) and pelvis (Δ up to 2.90 mm). Several datasets were non-normal by S-W and/or skewness/kurtosis. As a practical benchmark, margin differences ≥2 mm were considered clinically meaningful for interpretation, given typical planning target volume (PTV) rounding and published cone-beam computed tomography (CBCT)-era margins.
Conclusion:
In non-Gaussian or limited-data scenarios, the percentile-based method yields more conservative SMs than van Herk. Given EPID’s inability to capture organ motion/rotation, validation with CBCT or 4D imaging is recommended before clinical implementation.
Keywords
Cancer radiotherapy
Image-guided radiotherapy
Normal distribution
Radiotherapy setup margin
Van Herk’s formalism
INTRODUCTION
Accuracy and precision in modern radiotherapy, utilizing intensity modulation, stereotactic techniques, and image guidance, play a crucial role in achieving optimal therapeutic benefit while minimizing radiation-induced normal tissue toxicity. This is achieved by delivering an adequate radiation dose to the target (delineated tumor) while sparing organs at risk (OARs).[1,2] Geometric uncertainties, including patient setup deviation, are important determinants of accuracy and precision in radiotherapy.[3] Uncertainty due to setup deviation can be managed with a setup margin (SM) by expanding the clinical target volume (CTV) to a planning target volume (PTV). The SM accounts for systematic and random setup uncertainties and has a crucial role, as it may influence treatment outcomes.[3] An improperly estimated SM can compromise treatment efficacy by either under-dosing the target or exposing healthy tissues to excessive radiation, leading to adverse side effects.[4] Several margin estimation formalisms are available, such as those proposed by the International Commission on Radiation Units and Measurements (ICRU) Report 62,[5] Stroom et al.,[6] and van Herk et al.,[7] which have been used in clinics for decades; among them, van Herk’s method is widely accepted in clinical practice, mainly due to its simplicity.[8] Van Herk et al. derived a “margin recipe” to ensure minimum dose coverage of the CTV for a specified percentage of the patient population (e.g., 95% dose coverage for 90% of patients) based on the dose-population histogram and the probability of correct target dose delivery.[7] Although van Herk’s method was established as a cornerstone in estimating setup margins in radiotherapy, several modifications or alternatives have also been proposed over the years using different approaches, including factors like the number of beams, tissue density, target size, non-Gaussian distribution of setup error data, an optimal statistical model approach, and even a radiobiological approach.[8] Van Herk’s method assumes that setup errors, consisting of systematic (Σ) and random (σ) error components, follow a normal distribution[9] and may be suitable for large datasets.[7] While this method may work well for large sample sizes, it can yield inaccurate results for smaller datasets or data that are not normally distributed.[10] To address this limitation, Suzuki et al.[10] proposed a non-Gaussian approach that derives SMs directly from error histograms, without assuming normality. However, their work was limited to prostate cancer cases, leaving the utility of the technique for other treatment sites unexplored. Additionally, their results were not compared with those of the conventional methods traditionally used in clinics. As the issue of small sample size or non-normality of setup data may be relevant, particularly for smaller clinics or early-phase clinical implementations with limited patient throughput,[11] such an unconventional method of SM estimation that does not assume normality should be investigated further to validate or discard its clinical credibility.
In routine clinical workflows, setup-error histograms can deviate from normality (skewness, heavy tails, outliers), particularly in small or early-implementation cohorts.[10] In such scenarios, a distribution-aware, empirical (percentile/histogram-based) margin may better preserve intended CTV coverage than recipes that assume normality.[12] Accordingly, we report an empirical, non-Gaussian estimate alongside van Herk to gauge its potential clinical value when normality is violated.
In this study, we aim to assess the applicability and appropriateness of the unconventional SM estimation method across four tumor sites: brain, head and neck (H and N), thorax, and pelvis. We estimate the setup margin using a non-Gaussian approach based on the concept of Suzuki et al.[10] and compare the results with those of van Herk’s method. This evaluation will help determine whether the unconventional method provides more reliable margin estimates, especially when patient datasets are small or error distributions deviate from normality. With respect to imaging selection, cone-beam computed tomography (CBCT) offers volumetric visualization and can characterize rotations,[13] but its routine, every-fraction use may be constrained by equipment availability, acquisition time, and workflow considerations,[14,15] particularly in resource-limited or early-implementation environments. By contrast, electronic portal imaging device (EPID) verification is widely available, rapidly deployable, and readily integrated into daily treatment for bony-surrogate alignment, enabling consistent setup error capture across fractions within existing resources.[16-18] We therefore used daily EPID imaging because it is workflow-feasible in our setting. We acknowledge its limitations for soft-tissue visualization and rotations and interpret margins accordingly.
MATERIAL AND METHODS
The setup error data for eighty (80) patients treated with external beam radiotherapy (EBRT) between December 2021 and March 2022 were evaluated retrospectively in this study for setup margin (SM) analysis. The eighty patients were divided into four groups of twenty patients each, based on the four treatment sites: Brain (n = 20), Head and Neck (n = 20), Thorax (n = 20), and Pelvis (n = 20). Radiotherapy was delivered using either three-dimensional conformal radiotherapy (3DCRT) or volumetric arc therapy (VMAT) techniques on a 6-MV UNIQUE™ linear accelerator (Varian Medical Systems, Palo Alto, CA, USA) equipped with an electronic portal imaging device (EPID). The EPID (AS1000, Portal Vision Advanced Imaging™ v.1.6) has an amorphous silicon-based flat panel detector with a matrix of 1024 × 768 pixels and an active imaging area of 30.1 × 40.1 cm², which enables patient repositioning using megavoltage (MV) imaging. All the patients underwent pre-treatment online setup verification.[3] No ethical permissions were required for this retrospective study, as it does not include any human participants, animals, or tissues.
CT Simulation, immobilization, and treatment planning
During simulation, patients were scanned using the computed tomography (CT) machine Light Speed-VCT64™ (General Electric Medical Systems, Waukesha, Wisconsin) with a 3 mm slice thickness. Patients of all groups were scanned in the head-first supine position and immobilized mostly with thermoplastic masks (Orfit Industries NV, Wijnegem, Belgium), with vacuum cushions (Orfit Industries NV, Wijnegem, Belgium) used for a few patients with cancer in the thorax and pelvis regions. A universal type carbon-fiber flat-top base plate (Orfit Industries NV, Wijnegem, Belgium) was used to fix the immobilization device during the CT simulation. For immobilization, brain and H and N patients were secured using thermoplastic masks with 3- and 5-point fixation clamps, respectively, and with appropriate neck supports. Thorax and pelvis patients were immobilized using either vacuum cushions or 4-point thermoplastic masks with knee or ankle supports when necessary. The simulation was performed under the same conditions as treatment (e.g., full bladder and empty rectum for pelvic cases) to ensure reproducibility during treatment delivery. After radiotherapy (RT) simulation, patients’ data were transferred to the treatment planning system (TPS) for RT planning. Treatments were planned using Eclipse™ v.11.0.47 TPS (Varian Medical Systems, Palo Alto, CA, USA) and delivered using a 6-MV linear accelerator equipped with the EPID. The universal flat-top carbon-fiber base plate ensured consistent immobilization across imaging and treatment sessions.
Set up verification and imaging
Daily setup verification was performed using EPID, which acquires orthogonal images in the anterior-posterior and lateral directions. Deviations in three translational directions, i.e., lateral (R-L), superior-inferior (S-I), and anterior-posterior (A-P), were recorded. Portal images were automatically matched with digitally reconstructed radiographs (DRRs) from the planning CT, with manual adjustments made if needed. Deviations greater than 5 mm required patient repositioning and reimaging before treatment. For each patient, five fractions were analyzed, with the first two consecutive fractions imaged daily and subsequent fractions imaged weekly. A total of 800 portal images (10 images × 20 patients × 4 sites) were registered across all treatment sites. The whole process of patient setup and imaging was done under the close observation of the medical physicist. Patient setup: patients were first aligned by matching the projections of the wall-mounted room lasers (two lateral and one longitudinal) to the fiducial marks placed during simulation, and then shifted to the planning isocenter. After initial patient setup, two orthogonal portal images (PIs) of the setup fields in the anterior-posterior (AP) and lateral (LAT) directions were acquired using the EPID. Portal imaging was performed with the EPID detector surface at 140 cm from the source, using the single- or double-exposure feature (1-2 monitor units (MU) per field). Image matching process: PIs were compared with the corresponding DRRs generated from the planning CT.
The automatic anatomy matching feature of Portal Vision software v1.6 was used initially. This feature calculates the difference between the intended and actual patient position in the three translational directions, i.e., lateral (R-L), longitudinal or superior-inferior (S-I), and vertical or anterior-posterior (A-P), specified by three Cartesian coordinates (X, Y, Z). For each patient, certain bony landmarks were chosen based on tumor location and contoured on the DRR using a drawing tool; thereafter, matching was done by the radiation therapist.[3] Radiation oncologists performed manual adjustments to align bony landmarks for fine-tuning if needed. Setup error recording: the deviations in the patient’s position in the three translational directions, X (R-L), Y (S-I), and Z (A-P), were recorded. Rotational deviation was not accounted for in this study. The PI of the AP setup field was taken first and matched with the DRR, and deviations in the X and Y directions were noted. Next, the PI of the LAT setup field was taken, the same matching process was repeated, and deviations in the Y and Z directions were noted. The calculated deviations in the X, Y (taken from the AP or LAT setup field, whichever was larger), and Z directions were considered the daily setup shifts (Δx, Δy, Δz) in the R-L, S-I, and A-P directions, respectively, for a particular patient. Finally, the treatment couch was adjusted by applying the auto-calculated couch shifts, and treatment was completed.
Imaging modality and limitations: daily setup verification utilized EPID orthogonal images for bony-surrogate alignment due to their availability and workflow feasibility in our program.[14,16,17] EPID neither visualizes soft-tissue displacement nor quantifies rotational errors; therefore, recorded setup errors represent translational components only.
Organ motion, deformation, and rotations were not measured, which can bias Σ and σ downward in deformable or mobile sites (e.g., thorax, pelvis) and thus underestimate margins compared with CBCT/4D-based verification.[13,14,16,17] These constraints were considered when interpreting site-specific margins.
Set up margin (SM) calculation
Conventional method
Systematic and random setup errors in three translational directions (R-L, S-I, and A-P) were calculated for each group of patients across the four treatment sites (brain, H and N, thorax, pelvis) using the formulas described in detail by Remeijer et al.[19]
This method of calculation was considered a conventional method of setup error calculation in this work.
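The formulas themselves are not reproduced in this text. As a minimal sketch, the commonly used population definitions can be written as follows, on the assumption (not confirmed by the source) that the systematic error Σ is the standard deviation of the per-patient mean shifts and the random error σ is the root mean square of the per-patient standard deviations. The function name and the example shifts are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def population_setup_errors(shifts_by_patient):
    """Summarize one translational axis for one patient group.

    shifts_by_patient: list of per-patient lists of daily shifts (mm).
    Returns (overall mean, systematic error, random error) in mm.
    """
    patient_means = [mean(shifts) for shifts in shifts_by_patient]
    patient_sds = [stdev(shifts) for shifts in shifts_by_patient]
    overall_mean = mean(patient_means)
    # Assumed definition: systematic error = SD of the per-patient means.
    systematic = stdev(patient_means)
    # Assumed definition: random error = RMS of the per-patient SDs.
    random_error = sqrt(mean(sd ** 2 for sd in patient_sds))
    return overall_mean, systematic, random_error


# Hypothetical R-L shifts (mm): three patients, five imaged fractions each.
example_shifts = [
    [1.0, 2.0, 3.0, 2.0, 2.0],
    [-1.0, 0.0, 1.0, 0.0, 0.0],
    [3.0, 4.0, 5.0, 4.0, 4.0],
]
overall_mean, systematic, random_error = population_setup_errors(example_shifts)
print(f"M = {overall_mean:.2f}, Sigma = {systematic:.2f}, sigma = {random_error:.2f} (mm)")
```

In practice the function would be called once per axis and per tumor site, giving the twelve (Σ, σ) pairs used by the margin recipe.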
Once the systematic and random errors were calculated, these values were used to estimate setup margins (SM) using the SM calculation recipe provided by van Herk et al.,[7] as given by the following formula.
van Herk (2000):
SM = 2.5Σ + 0.7σ
Van Herk’s method is considered and cited as the conventional method of setup margin estimation in this study henceforth. The formula was applied separately for each translational axis (R-L, S-I, A-P) and each patient group. The resulting setup margins from the conventional method were compared with those from the unconventional method. However, it is important to note that the systematic error (Σ) is a quadrature sum of delineation error, motion error, phantom transfer error, and systematic setup error, whereas the random error (σ) is a quadrature sum of random setup error, motion error, and beam penumbra produced during treatment execution. Van Herk’s margin (CTV-PTV) formalism does not consider the penumbra, motion, planning parameter, or the algorithm error explicitly.[20] Moreover, portal images show the combined effects of setup errors and phantom transfer error, but not the delineation error or organ motion error, because the organ is not visible in the portal image.[20] Therefore, in this electronic portal imaging-based analysis of geometric setup uncertainties, it may be justified to consider only the systematic setup error component as the systematic error (Σ) and the random setup error component as the random error (σ) when applying van Herk’s margin formalism to calculate the setup margin. Hence, the exclusion of delineation and organ motion errors from the conventional setup margins in this EPID-based study is a limitation, because these errors were not measured separately or included in the current study.
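As a worked illustration, the recipe 2.5Σ + 0.7σ can be applied directly to the Σ and σ values reported in Table 1; rounding to two decimals reproduces the margins listed in Table 2. The dictionary layout and function name below are illustrative choices, not part of the original analysis.

```python
def van_herk_margin(systematic, random_error):
    """Setup margin (mm) from the recipe 2.5*Sigma + 0.7*sigma."""
    return 2.5 * systematic + 0.7 * random_error


# (Sigma, sigma) in mm per site and direction, as reported in Table 1.
TABLE_1 = {
    "Brain":   {"R-L": (0.17, 1.12), "S-I": (0.38, 2.51), "A-P": (0.43, 2.23)},
    "H and N": {"R-L": (0.17, 1.21), "S-I": (0.37, 2.79), "A-P": (0.32, 2.37)},
    "Thorax":  {"R-L": (0.52, 4.40), "S-I": (0.75, 5.02), "A-P": (0.21, 2.13)},
    "Pelvis":  {"R-L": (0.41, 4.05), "S-I": (0.46, 3.64), "A-P": (0.14, 1.10)},
}

conventional_margins = {
    site: {axis: round(van_herk_margin(*errors), 2) for axis, errors in axes.items()}
    for site, axes in TABLE_1.items()
}

for site, axes in conventional_margins.items():
    print(site, axes)
```

For example, the thorax S-I margin is 2.5 × 0.75 + 0.7 × 5.02 = 5.389 mm, which rounds to the 5.39 mm reported in Table 2.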
Unconventional method
In this study, a statistical-geometrical approach was used to implement the unconventional method, based on the non-Gaussian distribution methodology proposed by Suzuki et al.[10] but with modifications to align with our clinical setting. Rather than separately calculating systematic and random errors, this method derives SMs directly from the distribution of setup errors observed in daily clinical practice.
Data preparation
For each patient, the mean daily shifts across all imaged fractions were calculated for each translational direction (R-L, S-I, A-P). This resulted in a total of 12 datasets (3 axes × 4 tumor sites), three for each treatment site.
Normality testing
The Shapiro-Wilk (S-W) test was performed to check if each dataset followed a normal distribution.[21] Histogram plots and Quantile-Quantile (Q-Q) plots were obtained to visually assess deviation from normality.[21]
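The points of a normal Q-Q plot can be sketched with the Python standard library alone. The example below pairs each ordered sample value with the quantile of a normal distribution fitted to the same data. The plotting positions (i − 0.5)/n are one common convention and an assumption here (the source does not state which convention its software uses), and the example values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Return (theoretical quantile, sample quantile) pairs for a normal Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical per-patient mean shifts (mm) along one axis.
example_means = [-1.2, 0.4, 0.1, 2.3, -0.3]
qq_points = normal_qq_points(example_means)
for theoretical, sample in qq_points:
    print(f"theoretical {theoretical:6.2f}   sample {sample:6.2f}")
```

Points that fall close to the identity line suggest approximate normality, whereas systematic departures in the tails indicate skewness or heavy tails.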
Histogram-based setup margin estimation
Following the normality tests, the unconventional SMs were calculated based on histogram analysis of the datasets. The method does not rely on the normality assumption. Instead, an empirical, percentile-based margin was derived directly from the observed setup error histograms.[10,12] For each direction (R-L, S-I, A-P), the 5th (P5) and 95th (P95) percentiles were computed using linear interpolation. The method uses the 90% range of setup errors (RSE) (i.e., 90% RSE = P95 − P5), which includes data between the 5th and 95th percentiles of the distribution. The directional empirical margin was the full width of the 90% RSE, applied symmetrically about zero, and was considered the unconventional setup margin for each tumor site. For example, for the brain site, the setup error data for a particular direction were arranged in ascending order. The range between the 5th and 95th percentiles was determined and reported as the SM in that direction (i.e., R-L, S-I, A-P). This percentile-based approach limits the influence of extreme outliers on the margin estimate, making it more suitable for situations with small datasets or non-normal distributions. The values of 90% RSE obtained by analyzing histograms of mean daily shifts in each direction (R-L, S-I, A-P) across the four sites were considered as unconventional setup margins (SM) in this work. We selected 90% rather than 95% (as in Suzuki et al.[10]) to (i) align with the 90% population-coverage principle embedded in the van Herk criterion and (ii) reduce the influence of extreme tails in smaller cohorts; detailed implications are discussed in the Discussion. Distributional shape was screened (Q-Q plots/Shapiro-Wilk) to document departures from normality; near-Gaussian data were summarized as mean ± SD, otherwise as median [IQR].[10,12]
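The percentile step can be sketched as follows. The interpolation rule used here places the q-th percentile at position (n − 1)·q/100 in the sorted data, which is one common linear-interpolation convention; the source does not state which variant was used, so this is an assumption. The function names and the example shifts are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def rse_90(mean_shifts):
    """Return (P5, P95, 90% RSE) for per-patient mean shifts along one axis (mm)."""
    p5 = percentile(mean_shifts, 5)
    p95 = percentile(mean_shifts, 95)
    return p5, p95, p95 - p5


# Hypothetical per-patient mean shifts (mm) for 20 patients along one axis.
example_mean_shifts = [
    -2.1, -1.6, -1.4, -1.0, -0.8, -0.7, -0.5, -0.3, -0.2, 0.0,
    0.1, 0.3, 0.4, 0.6, 0.9, 1.1, 1.5, 1.9, 2.4, 3.8,
]
p5, p95, margin = rse_90(example_mean_shifts)
print(f"P5 = {p5:.2f} mm, P95 = {p95:.2f} mm, 90% RSE = {margin:.2f} mm")
```

With only 20 values per dataset, P5 and P95 each fall between the two most extreme observations on their side, so the estimate remains sensitive to the second-largest and second-smallest shifts.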
Statistical analysis
All statistical analyses were conducted using the Jeffreys’s Amazing Statistics Program (JASP) v0.19 software package (Netherlands).[22] The normality of each setup error dataset was checked quantitatively using the Shapiro-Wilk (S-W) test.[23] Skewness and kurtosis analyses were also performed as additional normality checks.[24] Histogram plots and quantile-quantile (Q-Q) plots were used for qualitative validation of normality.[21] Finally, a non-parametric paired test, the Wilcoxon signed-rank test, which is generally recommended for small, non-Gaussian samples,[25] was performed to compare conventional versus unconventional SMs. Medians[26] and effect sizes (Cohen’s d) were calculated, with d > 1.0 taken to indicate a large practical difference.[27] Statistical significance was set at p < 0.05.
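For readers who want to see the mechanics of the paired comparison, the sketch below computes the Wilcoxon signed-rank sums with a normal-approximation Z-score, and Cohen’s d from the pooled standard deviation. It is not the JASP implementation: it drops zero differences, uses average ranks for ties, omits tie and continuity corrections, and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, z, two-sided p) using the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # assign average ranks to tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    expected = n * (n + 1) / 4
    spread = sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    z = (min(w_plus, w_minus) - expected) / spread
    return w_plus, w_minus, z, min(1.0, 2 * NormalDist().cdf(z))


def cohens_d(x, y):
    """Difference of means divided by the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = sqrt(((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2))
    return (mean(x) - mean(y)) / pooled


# Hypothetical paired margins (mm) from two methods.
method_a = [5.0, 7.0, 9.0, 4.0, 8.0]
method_b = [4.0, 5.0, 6.0, 5.5, 3.0]
w_plus, w_minus, z, p = wilcoxon_signed_rank(method_a, method_b)
print(f"W+ = {w_plus}, W- = {w_minus}, z = {z:.2f}, p = {p:.3f}")
print(f"Cohen's d = {cohens_d(method_a, method_b):.2f}")
```

For very small samples an exact p-value is generally preferable to the normal approximation shown here.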
RESULTS
Conventional setup error and SM
The conventional setup errors for the population of patients are tabulated in [Table 1] (maximum values of systematic and random errors are highlighted). Maximum values of systematic errors were observed along the A-P direction for the brain case and along the S-I direction for all other sites. Similarly, maximum random errors were observed in the R-L direction for the pelvis case and along the S-I direction for the other cases. Following the setup error calculation, setup margins (SM) were estimated using the conventional SM calculation formula given by van Herk’s methodology,[7] as mentioned in the Material and Methods section. The results are tabulated in [Table 2] (maximum values highlighted). The setup margins observed across the four tumor sites were highest for the thorax cases and lowest for the brain cases [Table 2].
Table 1: Conventional setup errors (mm) for the patient population.
| Tumor site | Σ (R-L) | Σ (S-I) | Σ (A-P) | σ (R-L) | σ (S-I) | σ (A-P) |
|---|---|---|---|---|---|---|
| Brain | 0.17 | 0.38 | **0.43** | 1.12 | **2.51** | 2.23 |
| H and N | 0.17 | **0.37** | 0.32 | 1.21 | **2.79** | 2.37 |
| Thorax | 0.52 | **0.75** | 0.21 | 4.40 | **5.02** | 2.13 |
| Pelvis | 0.41 | **0.46** | 0.14 | **4.05** | 3.64 | 1.10 |
Systematic (Σ) and random (σ) errors were calculated using the formulas for conventional setup error. Bold values indicate the maximum values of systematic and random errors observed for each tumor site across the three translational directions. H: Head, N: Neck.
Table 2: Conventional setup margins (mm) estimated using van Herk’s (2000) formalism.
| Tumor site | R-L | S-I | A-P |
|---|---|---|---|
| Brain | 1.21 | **2.71** | 2.64 |
| H and N | 1.27 | **2.88** | 2.46 |
| Thorax | 4.38 | **5.39** | 2.02 |
| Pelvis | **3.86** | 3.70 | 1.12 |
Conventional setup margins were calculated according to van Herk’s formalism using the calculated values of systematic errors (Σ) and random errors (σ). Bold values indicate the maximum setup margins observed for each tumor site across the three translational directions. R-L: Right-left (lateral direction), S-I: Superior-inferior (longitudinal direction), A-P: Anterior-posterior (transverse direction). H: Head, N: Neck.
Unconventional setup error and SM
Normality check
In this study, the setup data of the patient population were analyzed qualitatively and quantitatively to check normality. For each tumor site, the means of daily setup shifts across fractions for each patient in each direction (i.e., R-L, S-I, A-P) were considered for evaluation. To check the normality of the data qualitatively, histogram plots and corresponding Q-Q plots were evaluated for each site in each direction, as given in [Figure 1] (A. Brain, B. H and N, C. Thorax, D. Pelvis). By visual inspection of the histograms and Q-Q plots, it can be noted that the histograms are neither symmetric nor clearly bell-shaped, whereas the data points are mostly semi-linear in the Q-Q plots.

Figure 1: Histogram plots and Q-Q plots of shifts (daily setup deviation) across three translational directions (Lateral, Longitudinal, and Transverse) for Brain, Head and Neck, Thorax, and Pelvis sites: contains a combined plot for (A) Brain, (B) Head and Neck, (C) Thorax, (D) Pelvis sites showing (a) histograms with Gaussian fit and corresponding (b) probability plots (Q-Q plots) for Lateral shifts (R-L), Longitudinal shifts (S-I), and Transverse shifts (A-P), respectively. Theoretical quantiles are plotted against sample quantiles in Q-Q plots to assess normality for each site in three translational directions. Large deviations from the diagonal line indicate non-normality. Note: Shifts: R-L (Lateral), S-I (Longitudinal), and A-P (Transverse) represent the three translational directions of setup adjustments. For each patient, the mean daily shifts across all fractions were calculated for each translational direction (R-L, S-I, A-P). This resulted in a total of 12 datasets (3 axes × 4 tumor sites), three for each treatment site. Hence, histograms and Q-Q plots of these 12 datasets were obtained. Gaussian fit: the smooth curve fitted over the histogram represents the best-fitting normal distribution for the observed data. Theoretical quantiles: represent the expected values of a perfectly normal distribution. Sample quantiles: derived from the observed data. H: Head, N: Neck, Q-Q: Quantile-quantile.
Brain case
Histogram distributions of R-L and A-P shifts are close to normal, with minor deviations. The S-I direction shows a slightly skewed distribution with more data concentrated around the mean. In the Q-Q plots, the R-L and A-P shifts align well with the theoretical normal quantiles, indicating normality. The S-I direction shows mild deviations at the tails, suggesting a slightly non-normal distribution.
H and N case
R-L and S-I histogram distributions are slightly skewed, with longer tails on one side. The A-P direction is closer to normal but with less variance. In the Q-Q plots, the R-L and S-I directions show noticeable deviations in the tails, indicating non-normality. The A-P shifts align better with normality.
Thorax case
Histograms of all three directions (R-L, S-I, A-P) showed distributions with slight skewness and broader tails, indicating moderate non-normality. The Q-Q plots show clear deviations from normality at both ends of the distributions for all three directions.
Pelvis case
Histogram distributions of shifts in the R-L, S-I, and A-P directions exhibit some asymmetry, with the R-L and S-I directions having broader tails and the A-P direction showing a tighter distribution. In the Q-Q plots, there are visible deviations in the tails for all directions, suggesting non-normal distributions.
The Shapiro-Wilk (S-W) test and skewness and kurtosis analysis were performed as quantitative normality tests. The results of both quantitative analyses of the setup data for each direction and each tumor site are tabulated in Table 3. The S-W statistics were not statistically significant (p > 0.05) except for the brain case (R-L), the thorax case (A-P), and the pelvis case (A-P) (section (a) in Table 3).
Table 3: Normality tests of setup error data.
(a) S-W test
| Direction | Brain: statistic | Brain: p-value* | H and N: statistic | H and N: p-value* | Thorax: statistic | Thorax: p-value* | Pelvis: statistic | Pelvis: p-value* |
|---|---|---|---|---|---|---|---|---|
| R-L | 0.898 | 0.04 | 0.923 | 0.11 | 0.972 | 0.79 | 0.974 | 0.83 |
| S-I | 0.983 | 0.97 | 0.972 | 0.80 | 0.966 | 0.66 | 0.982 | 0.96 |
| A-P | 0.987 | 0.99 | 0.946 | 0.31 | 0.878 | 0.02 | 0.706 | 0.00 |
(b) Skewness and kurtosis analysis
| Tumor site | Skewness (R-L) | Skewness (S-I) | Skewness (A-P) | Kurtosis (R-L) | Kurtosis (S-I) | Kurtosis (A-P) |
|---|---|---|---|---|---|---|
| Brain | 3.48 | 0.77 | -0.41 | 13.95 | -0.34 | 0.09 |
| H and N | 0.71 | 0.01 | -0.24 | 0.19 | -0.56 | -1.01 |
| Thorax | -0.15 | 0.74 | -1.16 | 0.14 | 0.68 | 2.54 |
| Pelvis | 0.17 | -0.22 | 0.11 | -0.37 | 0.23 | 6.15 |
Normality of setup error data was checked using (a) Shapiro-Wilk (S-W) test statistics and (b) skewness and kurtosis analysis. *p-value < 0.05 is considered statistically significant. S-W: Shapiro-Wilk, H and N: Head and neck, R-L: Right-left (lateral), S-I: Superior-inferior (longitudinal), A-P: Anterior-posterior (transverse).
In addition, skewness and kurtosis were analyzed to check normality further. It was observed (section (b) in Table 3) that all the histograms are either positively or negatively skewed and have kurtosis values that deviate to varying degrees from the standard value for a normal distribution.[24] For the R-L direction, setup data for the brain case deviate significantly from normality (confirmed by both the S-W test and skewness/kurtosis analysis). For the S-I direction, data for all tumor sites appear normally distributed, and for the A-P direction, thorax and pelvis data deviate significantly from normality (confirmed by the S-W test and kurtosis). Therefore, the overall results of the S-W test [Table 3a] and the skewness and kurtosis analysis [Table 3b] indicate that the datasets were not necessarily normally distributed. Hence, following the normality tests, the setup error data were evaluated using the unconventional approach.
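Skewness and excess kurtosis can be computed from the central moments of each dataset, as in the sketch below. Both the plain moment-based estimators and the commonly used small-sample adjusted forms are shown. Which estimator the statistics package applied is not stated in the source, so values from this sketch need not match Table 3 exactly. The function name and data are hypothetical.

```python
from math import sqrt


def skewness_and_excess_kurtosis(values, adjusted=False):
    """Return (skewness, excess kurtosis); both are 0 for a normal distribution."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0
    if adjusted:  # small-sample corrections; require n > 3
        skew *= sqrt(n * (n - 1)) / (n - 2)
        kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * kurt + 6)
    return skew, kurt


# Hypothetical per-patient mean shifts (mm) with one large positive outlier.
example_values = [-0.6, -0.2, 0.0, 0.1, 0.3, 0.4, 0.7, 4.5]
skew, kurt = skewness_and_excess_kurtosis(example_values, adjusted=True)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

A single outlying patient, as in the hypothetical data above, is enough to produce large positive skewness and kurtosis in a cohort of this size.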
Unconventional SM
In the unconventional method, histograms of daily setup data (i.e., the mean values of daily shifts across fractions for each patient) for the population were obtained for the brain case, H and N case, thorax case, and pelvis case. The 90% RSE values were estimated from the histogram data of each translational direction and tumor site. The values of 90% RSEs were considered directly as unconventional setup margins (SMs) (which consist of systematic and random uncertainty components) and tabulated in Table 4. Maximum values of unconventional SMs were 4.36 mm, 4.82 mm, 9.02 mm, and 6.6 mm for the brain, H and N, thorax, and pelvis cases, respectively (highlighted in Table 4).
Table 4: Unconventional setup margins (90% RSE, mm).
| Tumor site | R-L | S-I | A-P |
|---|---|---|---|
| Brain | 0.73 | **4.36** | 3.59 |
| H and N | 2.41 | **4.82** | 4.21 |
| Thorax | 6.17 | **9.02** | 3.45 |
| Pelvis | 6.03 | **6.60** | 1.12 |
Unconventional setup margins (i.e., 90% RSE) were calculated using the percentile method directly from the setup error histogram data. R-L: Right-left (lateral direction), S-I: Superior-inferior (longitudinal direction), A-P: Anterior-posterior (transverse direction). Bold values indicate maximum unconventional setup margins for each tumor site.
Conventional versus Unconventional SM
Comparing the values in Tables 2 and 4, the largest conventional SMs (using van Herk’s formalism) among the three translational directions (i.e., R-L, S-I, A-P) were smaller than the largest unconventional SMs by 1.65 mm for the brain case, 1.94 mm for the H and N case, 3.63 mm for the thorax case, and 2.74 mm for the pelvis case. The largest conventional SMs were observed in the S-I direction for all tumor sites except the pelvis, whereas the largest unconventional SMs were observed in the S-I direction for all sites. The unconventional SM was also highest for the thorax and lowest for the brain case, similar to the conventional method (i.e., Thorax > Pelvis > H and N > Brain). A non-parametric paired test, the Wilcoxon signed-rank test, which is generally recommended for comparing two methods without assuming normality,[25] was used. The statistical comparison of SMs estimated by the unconventional and conventional methods is tabulated in Table 5. For the skewed setup error distributions found in this study, the median was prioritized over the mean as a measure of central tendency, as it is less influenced by outliers.[20] This aligns with recommendations for analyzing non-Gaussian data in clinical studies.[26] Therefore, median values of SMs for each translational direction (R-L, S-I, A-P) across all tumor sites, along with p-values, are given in Table 5. The median setup margin across all tumor sites and directions was 2.71 mm for the conventional method, whereas it was 4.21 mm for the unconventional method. The comparisons were statistically significant (p<0.05) except for the pelvis (A-P) case. The effect size (Cohen’s d, i.e., the difference between two means divided by the pooled standard deviation) was also calculated to express the magnitude of the difference between methods; Cohen’s d > 1 implies a large practical difference between methods.[27]
| Tumor site | Direction | Median, unconventional (mm) | Median, conventional (mm) | Cohen’s d | Z-score | p-value |
|---|---|---|---|---|---|---|
| Brain | R-L | 0.73 | 1.21 | 0.8 | -1.96 | 0.049* |
| | S-I | 4.36 | 2.71 | 1.2 | -2.05 | 0.040* |
| | A-P | 3.59 | 2.64 | 0.9 | -1.99 | 0.046* |
| H and N | R-L | 2.41 | 1.27 | 1.1 | -2.02 | 0.043* |
| | S-I | 4.82 | 2.88 | 1.4 | -2.10 | 0.036* |
| | A-P | 4.21 | 2.46 | 1.3 | -2.01 | 0.044* |
| Thorax | R-L | 6.17 | 4.38 | 1.5 | -2.03 | 0.042* |
| | S-I | 9.02 | 5.39 | 2.1 | -2.20 | 0.028* |
| | A-P | 3.45 | 2.02 | 1.2 | -1.98 | 0.048* |
| Pelvis | R-L | 6.03 | 3.86 | 1.3 | -2.04 | 0.041* |
| | S-I | 6.60 | 3.70 | 1.7 | -2.15 | 0.032* |
| | A-P | 1.12 | 1.12 | 0.0 | 0.00 | 1.000 |
| Overall (median across all sites) | - | 4.21 | 2.71 | | | |

Note:
- Median: the middle value in an ordered dataset, resistant to outliers. The median better represents “typical” errors than the mean for skewed data.
- Z-score: measures how many standard deviations a data point is from the mean. In the Wilcoxon signed-rank test, it standardizes the test statistic (W) to compute the p-value.
- The 90% RSE method consistently produced larger margins than van Herk’s (p < 0.05).
- Thorax and pelvis showed the highest discrepancies, likely due to non-normal error distributions (skewness/kurtosis).
- Largest differences: Thorax S-I (Δ = 3.63 mm, p = 0.028); Pelvis R-L (Δ = 2.17 mm, p = 0.041).
- All comparisons were statistically significant (marked *, p < 0.05) except pelvis A-P (identical medians).
- Cohen’s d > 1.0 indicates a large practical difference between methods (thorax S-I: d = 2.1); pelvis A-P showed no effect (d = 0.0) due to identical values.
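The statistics reported in Table 5 can be illustrated with a minimal sketch. It assumes a normal-approximation Wilcoxon signed-rank test without tie or continuity correction, and Cohen’s d computed with a pooled standard deviation; the function names and the paired values are hypothetical and are not the study data, so the output will not reproduce Table 5. Statistical packages may apply corrections that give slightly different z and p values.

```python
import math
import statistics


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped and tied absolute differences receive
    average ranks. No tie or continuity correction is applied.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


def cohens_d(x, y):
    """Difference between the two means divided by the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(pooled_var)


# Hypothetical paired margins (mm) for illustration only; not study data.
unconventional = [4.0, 5.5, 3.8, 6.1, 4.9, 5.2, 4.4, 6.0]
conventional = [3.1, 3.9, 3.5, 4.2, 3.0, 4.1, 3.6, 4.3]

z, p = wilcoxon_signed_rank(unconventional, conventional)
d = cohens_d(unconventional, conventional)
print(f"z = {z:.2f}, p = {p:.3f}, Cohen's d = {d:.2f}")
```

Taking the smaller of the two rank sums as the test statistic yields a negative z whenever the methods differ, which matches the sign convention of the z-scores in Table 5.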
DISCUSSION
Setup accuracy varies widely depending on the treatment site, the method or type of immobilization, and institute-specific protocols or methodologies.[28-30] The overall geometric uncertainty, consisting of systematic and random components, must be site-specific and equipment-specific,[2,29] and it also depends on the clinical experience or expertise of the individual.[28,31] Hence, the setup errors and setup margins reported in the literature are indicative only. In the present work, setup data from a small number of patients (20 per site) across four tumor sites, collected over a short time period, were evaluated. This study explored the appropriateness of an unconventional, histogram-based setup margin estimation approach by systematically comparing its results with those of the existing conventional method (i.e., van Herk’s). Although each site cohort only meets the recommended minimum of 20 patients for setup margin analysis,[20] the findings indicate that the unconventional method yields consistently larger SMs than the conventional method, especially in thorax and pelvis cases, highlighting its potential value in clinical settings where error distributions deviate from normality.
Systematic and random errors
The systematic setup errors calculated using the conventional method are less than 1 mm across all tumor sites in this work [Table 1], which is similar to previously reported EPID-based studies.[32-35] The maximum random errors for the brain, H and N, and pelvis cases found in this work [Table 1] are also comparable to previous EPID-based studies.[36-38] However, for the thorax case, the maximum random error observed was 5.02 mm, which is larger than in similar EPID-based studies.[32,39] This difference in random errors for the thorax may arise from differences in the expertise of individual radiation professionals.[28]
Conventional setup margins
The maximum conventional SMs for the brain, H and N, thorax, and pelvis cases [Table 2] found in this study are similar to those of an EPID-based multi-site study.[33] However, the results of the current study [Table 2] differ slightly from some other EPID-based single-site studies.[35-38] These small differences may arise from various factors such as institute-specific imaging protocol, individual experience or expertise, type of imaging modality, and immobilization.[28,29,31,40] Identifying the specific reasons would need a dedicated, robust study, which was outside the scope of the present work. Moreover, the conventional margins of the current study [Table 2] agree with previous CBCT-based studies[41,42] for the brain and H and N sites, but are comparatively smaller for the thorax and pelvis sites. Margins for the thorax or pelvis may be underestimated in the present EPID-based measurement because organ motion and rotational inaccuracy are not accounted for; these are assessed more reliably with CBCT-based imaging.[13]
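As a minimal sketch of how a conventional margin of this kind is obtained, the following applies the recipe 2.5Σ + 0.7σ to one translational direction. It assumes the commonly used definitions (Σ as the standard deviation of the per-patient mean shifts, σ as the root mean square of the per-patient standard deviations); the function name and the input shifts are hypothetical and do not come from the study.

```python
import math
import statistics


def van_herk_setup_margin(shifts_by_patient):
    """Return (Sigma, sigma, margin) in mm for one translational direction.

    shifts_by_patient: one list of daily setup shifts (mm) per patient.
    Assumed definitions: Sigma is the SD of the per-patient mean shifts,
    sigma is the root mean square of the per-patient SDs.
    """
    patient_means = [statistics.mean(s) for s in shifts_by_patient]
    patient_sds = [statistics.stdev(s) for s in shifts_by_patient]
    systematic = statistics.stdev(patient_means)
    random_ = math.sqrt(statistics.mean([sd ** 2 for sd in patient_sds]))
    margin = 2.5 * systematic + 0.7 * random_
    return systematic, random_, margin


# Hypothetical daily S-I shifts (mm) for four patients; not study data.
shifts = [
    [1.2, -0.5, 2.1, 0.8, 1.5],
    [-1.0, 0.3, -2.2, -0.7, 0.1],
    [0.4, 1.9, 0.2, -0.6, 1.1],
    [2.5, 3.1, 1.4, 2.0, 2.8],
]

big_sigma, small_sigma, sm = van_herk_setup_margin(shifts)
print(f"Sigma = {big_sigma:.2f} mm, sigma = {small_sigma:.2f} mm, SM = {sm:.2f} mm")
```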
Normality test
Conventional methods of setup margin calculation assume that setup errors are normally distributed,[4] whereas the unconventional method does not assume that setup errors are necessarily normally distributed.[10] The S-W test [Table 3a] did not reject normality (p > 0.05) except for the brain (R-L), thorax (A-P), and pelvis (A-P) cases. However, samples from a normal population distribution do not necessarily appear normal, especially when the sample size is small.[21,43] Therefore, with the small data size evaluated in this work, it may be difficult to decide whether the data are normally distributed solely from the S-W statistics. In such a situation, a quantitative assessment of the skewness and kurtosis of the distribution, together with visual inspection of histograms and Q-Q plots, may be useful to determine the normality of the data.[21,24] The skewness and excess kurtosis values (i.e., kurtosis minus three) given in Table 3b differed from their ideal or reference values,[24] indicating that the setup data sets deviate from normality. Moreover, visual inspection shows that the histograms are neither symmetric nor bell-shaped, whereas the data points are mostly semi-linear in the Q-Q plots [Figure 1]. Therefore, the setup error data sets in the present work do not necessarily belong to a normal distribution and hence are eligible for unconventional margin estimation.
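The skewness and excess kurtosis check described above can be sketched as follows. This assumes the plain moment-based estimators; statistical packages often apply small-sample corrections, so their values may differ from this sketch. The function name and the sample shifts are hypothetical.

```python
import statistics


def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (kurtosis minus three)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Hypothetical setup shifts (mm) with a long positive tail; not study data.
shifts = [-1.2, -0.8, -0.5, -0.3, 0.0, 0.1, 0.4, 0.6, 1.0, 5.5]

skewness, excess_kurtosis = skew_and_excess_kurtosis(shifts)
print(f"skewness = {skewness:.2f}, excess kurtosis = {excess_kurtosis:.2f}")
```

For a normal distribution both quantities are zero, so values far from zero point toward a skewed or heavy-tailed data set.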
Unconventional setup margins
This study expands on the methodology proposed by Suzuki et al.[10], with key differences in imaging techniques, sample sizes, and analytical approaches. Suzuki et al.[10] employed kilovoltage (kV) imaging, integrating X-ray and optical infrared tracking for setup verification, while our study used a megavoltage (MV) electronic portal imaging device (EPID), which is widely available in routine clinical settings, especially in low-resource environments and smaller clinics.[44] Suzuki et al.[10] analyzed 555 setup images for a prostate cancer cohort, which may be considered a larger sample size, offering strong statistical robustness. In comparison, this study analyzed 200 images per group, which may have impacted the statistical results for normality and contributed to semi-linearity in the Q-Q plots.[21] Another notable methodological divergence is data correction: Suzuki et al.[10] adjusted their setup data by subtracting systematic errors before constructing histograms, whereas we used uncorrected daily average shifts to generate histograms [Figure 1] and directly compute the 90% ranges of setup errors (RSE). Further, in this work, unconventional setup margins (SMs) were compared against the conventional SMs, which were calculated using the 90% population-based van Herk’s method.[4] To ensure consistency, 90% RSE values were used in the unconventional SM estimations, whereas Suzuki et al.[10] used 95% RSEs. These methodological differences likely explain the lower unconventional SMs reported in the present study compared to Suzuki et al.[10] However, this direct comparison may not be sufficient, because the present study is a multi-site evaluation in which unconventional setup margins (i.e., 90% RSEs) were compared with existing conventional setup margin recipes, whereas Suzuki et al.[10] compared the 95% RSEs with the 95% CI of the normal distribution for a specific tumor site (prostate case only).
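A minimal sketch of the percentile step, assuming the 90% RSE is bounded by the 5th and 95th percentiles of the per-patient mean daily shifts and that percentiles are linearly interpolated. How the two bounds are converted into a single margin is not specified in this section, so taking the larger absolute bound is an assumption of the sketch, as are the function name and the input values.

```python
import statistics


def rse_90(mean_shifts):
    """Return the 5th and 95th percentiles (mm) of the given shifts."""
    # n=20 yields cut points at 5%, 10%, ..., 95%; keep the first and last.
    cuts = statistics.quantiles(mean_shifts, n=20, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical per-patient mean daily shifts (mm), 20 patients, one direction.
mean_shifts = [-3.2, -1.8, -1.1, -0.9, -0.6, -0.4, -0.2, 0.0, 0.1, 0.3,
               0.5, 0.8, 1.0, 1.4, 1.9, 2.3, 2.8, 3.5, 4.6, 6.2]

p5, p95 = rse_90(mean_shifts)
# Assumption: the larger absolute bound serves as a symmetric margin.
margin = max(abs(p5), abs(p95))
print(f"90% RSE: [{p5:.2f}, {p95:.2f}] mm; assumed margin = {margin:.2f} mm")
```

Because the bounds are read directly from the ordered data, a long tail on one side shifts that bound outward without any distributional assumption.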
Conventional versus unconventional SMs
A statistical comparison [Table 5] between the unconventional SM approach and the conventional SM formalism (van Herk’s) was performed in this study using the Wilcoxon signed-rank test (a non-parametric paired test). The pattern of SMs (i.e., Thorax > Pelvis > H and N > Brain) is the same for both methods and agrees with previous studies.[42,43] Nevertheless, Table 5 shows z-scores of higher magnitude, which indicate stronger evidence against the null hypothesis (no difference) and hence reflect the size of the difference between the two methods (unconventional vs. conventional). Because the error distributions were found to be skewed in this study, the median was prioritized over the mean as a measure of central tendency.[26] The differences in the median SMs [Table 5] indicate that the conventional method underestimates the setup margins compared to the unconventional method, with statistically significant results (p<0.05) in the present study. Cohen’s d values are reported for completeness, but a large effect size (Cohen’s d > 1.0) does not necessarily imply clinical importance. To separate statistical from clinical importance, we pre-specified a clinically meaningful difference in margin as ≥ 2 mm, consistent with common IGRT precision considerations and the population-coverage objective underlying margin recipes. Inferences about clinical relevance were based primarily on the pre-specified ≥ 2 mm threshold and anatomical context (dose fall-off near the PTV edge, proximity of OARs, and risk of target under-coverage), rather than on effect-size magnitude alone. Using this criterion, differences meeting ≥ 2 mm were observed for thorax S-I (Δ = 3.63 mm), pelvis S-I (Δ = 2.90 mm), and pelvis R-L (Δ = 2.17 mm); other site-direction pairs showed smaller discrepancies (< 2 mm).
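The ≥ 2 mm criterion can be checked directly against the medians in Table 5. The sketch below copies the table values and flags the site-direction pairs whose difference (unconventional minus conventional) meets the threshold; the variable names are illustrative.

```python
# (site, direction, unconventional median, conventional median), in mm,
# copied from Table 5.
TABLE_5 = [
    ("Brain", "R-L", 0.73, 1.21),
    ("Brain", "S-I", 4.36, 2.71),
    ("Brain", "A-P", 3.59, 2.64),
    ("H and N", "R-L", 2.41, 1.27),
    ("H and N", "S-I", 4.82, 2.88),
    ("H and N", "A-P", 4.21, 2.46),
    ("Thorax", "R-L", 6.17, 4.38),
    ("Thorax", "S-I", 9.02, 5.39),
    ("Thorax", "A-P", 3.45, 2.02),
    ("Pelvis", "R-L", 6.03, 3.86),
    ("Pelvis", "S-I", 6.60, 3.70),
    ("Pelvis", "A-P", 1.12, 1.12),
]

THRESHOLD_MM = 2.0  # pre-specified clinically meaningful difference

flagged = [
    (site, direction, round(unconv - conv, 2))
    for site, direction, unconv, conv in TABLE_5
    if unconv - conv >= THRESHOLD_MM
]

for site, direction, delta in flagged:
    print(f"{site} {direction}: delta = {delta:.2f} mm")
```

Only thorax S-I and pelvis R-L and S-I pass the threshold, in line with the pairs named above.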
The median values of setup margins across tumor sites and directions evaluated by these two methods (unconventional vs. conventional) differ by 1.5 mm, which may result in insufficient target coverage or excess dose to surrounding organs and hence impact clinical outcomes. However, in this EPID-based geometric uncertainty analysis, systematic (Σ) and random (σ) errors were defined exclusively as systematic and random setup errors, respectively, when applying van Herk’s margin formalism, as discussed in the Material and Methods section. Portal images capture the combined effects of setup errors and phantom transfer error, but not delineation error or organ motion error, as the organ itself is not visible in portal images.[12] This simplification introduces a limitation: delineation error and organ motion error, which were neither measured nor included using literature references in this study, were excluded from the conventional setup margin calculations, and this may have influenced the difference in SMs (unconventional vs. conventional).
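To illustrate why omitting these components matters, the sketch below combines independent error components in quadrature (root sum of squares), which is the usual way such components enter a margin recipe, before applying 2.5Σ + 0.7σ. All numerical values and function names are hypothetical; none are measurements or results from this study.

```python
import math


def combine_in_quadrature(*components_mm):
    """Combine independent error components (SDs, mm) as a root sum of squares."""
    return math.sqrt(sum(c ** 2 for c in components_mm))


def van_herk_margin(systematic_mm, random_mm):
    """Margin recipe 2.5 * Sigma + 0.7 * sigma (mm)."""
    return 2.5 * systematic_mm + 0.7 * random_mm


# Hypothetical SDs (mm) for illustration only; none are from the study.
setup_sys, setup_rand = 0.8, 2.0
delineation_sys = 1.5
motion_sys, motion_rand = 1.0, 1.5

setup_only = van_herk_margin(setup_sys, setup_rand)
with_all = van_herk_margin(
    combine_in_quadrature(setup_sys, delineation_sys, motion_sys),
    combine_in_quadrature(setup_rand, motion_rand),
)
print(f"setup errors only: {setup_only:.2f} mm")
print(f"with delineation and organ motion: {with_all:.2f} mm")
```

Any non-zero additional component increases the combined Σ or σ, so a conventional margin computed from setup errors alone is a lower bound on the margin with all components included.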
The primary outcome of this study shows that SMs derived from conventional methods underestimate the margins when setup error data do not follow a normal distribution, as observed in this study. Moreover, the current EPID-based setup margins estimated using the unconventional method are close or comparable to the results of studies based on an advanced imaging modality (CBCT).[41,42] One of the most important advantages of the unconventional method is its adaptability to various data distributions. The 90% RSE approach, which does not rely on normality assumptions, inherently accommodates extreme values, offering a conservative estimate that safeguards against under-dosing. This attribute makes the unconventional method a viable option in resource-limited clinical environments, where patient data may be limited, or in early-phase clinical trials, where standardizing setup procedures across a smaller cohort is challenging. In that sense, it may be stated that the unconventional method is more appropriate and practically useful than the conventional method when setup data are not normally distributed or when data from only a small number of patients are available.
However, the unconventional method used in this study is a statistical-geometrical methodology that depends more on the extremes (i.e., the 5% and 95% edges) of the population distribution, whereas the conventional methods are based on either dose coverage probability or dose-population-based methodology, which considers the concept of finite penumbra width.[4,7] EPID captures bony, translational shifts only and does not quantify soft-tissue motion/deformation or rotations; true uncertainty, particularly in the thorax and pelvis, may therefore be underestimated, and these unmeasured components likely explain part of the larger unconventional margins observed here. Moreover, EPID-based measurements may be considered inferior to other high-end techniques such as CBCT-based measurements.[45] These points may be considered limitations of the present work. For practical adoption, we recommend using the unconventional percentile margin anisotropically where Δ ≥ 2 mm (in our cohort, thorax S-I and pelvis S-I/R-L) to better safeguard CTV coverage in heavy-tail directions while avoiding unnecessary expansion elsewhere. To validate the appropriateness of the unconventional method over the conventional methods, more rigorous and robust investigations will be required.
CONCLUSION
It can be concluded that the unconventional method provides a conservative and clinically feasible alternative to traditional SM estimation techniques, especially in scenarios where setup errors deviate from a normal distribution. Although EPID-based imaging has its limitations, the results from this study suggest that the unconventional method could enhance treatment efficacy by providing more realistic margins in variable clinical contexts. However, before widespread adoption, validation with advanced imaging systems like CBCT and multi-center evaluations is recommended to establish its robustness and clinical utility in broader radiotherapy applications.
Acknowledgments:
The first and second authors contributed equally to the work and should be considered co-first authors.
Author contributions:
KM and M: Equal contribution as co-first authors; data collection, EPID image analysis, statistical analysis, manuscript writing; AM and AV: Study design, methodology development, data interpretation, critical revision; SY: Statistical methodology, data validation; SD: Clinical guidance, patient data oversight; GP: Study conceptualization, supervision, data interpretation, final approval. All authors have read and approved the final manuscript.
Ethical approval:
Institutional Review Board approval is not required for a retrospective study.
Declaration of patient consent:
Patient consent was not required, as there are no patients in this study.
Conflicts of interest:
There are no conflicts of interest.
Use of artificial intelligence (AI)-assisted technology for manuscript preparation:
The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript, and no images were manipulated using AI.
Financial support and sponsorship: Nil
References
- Image guidance in radiation therapy for better cure of cancer. Mol Oncol. 2020;14:1470-91.
- On target: ensuring geometric accuracy in radiotherapy. 2008. London: Royal College of Radiologists; [accessed 2024 Jul 10] Available from: http://www.rcr.ac.uk/docs/oncology/pdf/BFCO(08)5_On_target.pdf
- Errors and margins in radiotherapy. Semin Radiat Oncol. 2004;14:52-64.
- ICRU Report 62: Prescribing, Recording and Reporting Photon Beam Therapy (Supplement to ICRU 50) Bethesda, MD: ICRU; 1999.
- Geometrical uncertainties, radiotherapy planning margins, and the ICRU-62 report. Radiother Oncol. 2002;64:75-83.
- The probability of correct target dosage: dose-population histograms for deriving treatment margins in radiotherapy. Int J Radiat Oncol Biol Phys. 2000;47:1121-1135.
- Target margins in radiotherapy of prostate cancer. Br J Radiol. 2016;89:20160312.
- Uncertainty in patient set-up margin analysis in radiation therapy. J Radiat Res. 2012;53:615-19.
- The importance of small samples in medical research. J Postgrad Med. 2021;67:219-23.
- Clinical value of styrofoam fixation in intracranial tumor radiotherapy. Front Oncol. 2023;13:1131006.
- Breast patient setup error assessment: comparison of electronic portal image devices and cone-beam computed tomography matching results. Int J Radiat Oncol Biol Phys. 2010;78:1235-43.
- ACR-ASTRO practice parameter for image-guided radiation therapy (IGRT) Am J Clin Oncol. 2020;43:459-68.
- Phantom and in-vivo measurements of dose exposure by image-guided radiotherapy (IGRT): MV portal images vs. kV portal images vs. cone-beam CT. Radiother Oncol. 2007;85:418-23.
- A new registration algorithm of electronic portal imaging devices images based on the automatic detection of bone edges during radiotherapy. Sci Rep. 2020;10:10253.
- A literature review of electronic portal imaging for radiotherapy dosimetry. Radiother Oncol. 2008;88:289-309.
- Comparison of setup accuracy of three different image assessment methods for tangential breast radiotherapy. J Med Radiat Sci. 2016;63:224-31.
- 3-D portal image analysis in clinical practice: An evaluation of 2-D and 3-D analysis techniques as applied to 30 prostate cancer patients. Int J Radiat Oncol Biol Phys. 2000;46:1281-90.
- An analysis of geometric uncertainty calculations for prostate radiotherapy in clinical practice. Br J Radiol. 2009;82:140-147.
- The limitation of widely used data normality tests in clinical research. Aud Vest Res. 2022;31:1-3.
- JASP: graphical statistical software for common statistical designs. J Stat Softw. 2019;88:1-10.
- Statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis. Restor Dent Endod. 2013;38:52-57.
- The choice of statistical methods for comparisons of dosimetric data in radiotherapy. Radiat Oncol. 2014;9:205.
- Choosing the appropriate measure of central tendency: mean, median, or mode? Knee Surg Sports Traumatol Arthrosc. 2023;31:12-15.
- Using effect size-or why the p value is not enough. J Grad Med Educ. 2012;4:279-82.
- Electronic portal imaging registration in breast cancer radiotherapy verification: analysis of inter-observer agreement among different categories of health practitioners. Neoplasma. 2013;60:302-08.
- Set-up verification using portal imaging: review of current clinical practice. Radiother Oncol. 2001;58:105-20.
- Factors impacting on patient setup analysis and error management during breast cancer radiotherapy. Crit Rev Oncol Hematol. 2022;178:103798.
- ESTRO ACROP guidelines for positioning, immobilisation and position verification of head and neck patients for radiation therapists. Tech Innov Patient Support Radiat Oncol. 2017;1:1-7.
- Set-up errors in radiotherapy for oesophageal cancers: Is electronic portal imaging or cone-beam more accurate? Radiother Oncol. 2011;98:249-54.
- Evaluation of patient setup accuracy and determination of optimal setup margin for external beam radiation therapy using electronic portal imaging device. Cancer Ther Oncol Int J. 2018;11:555808.
- Interfractional set-up errors evaluation by daily electronic portal imaging of IMRT in head and neck cancer patients. Acta Oncol. 2009;48:440-45.
- A prospective analysis of inter-and intrafractional errors to calculate CTV to PTV margins in head and neck patients. Clin Transl Oncol. 2015;17:113-20.
- Assessment of three-dimensional set-up errors in conventional head and neck radiotherapy using electronic portal imaging device. Radiat Oncol. 2007;2:44.
- Evaluation of set-up errors and determination of set-up margin in pelvic radiotherapy by electronic portal imaging device. J Radiother Pract. 2020;19:150-56.
- Evaluation of setup errors in conformal radiotherapy for pelvic tumours: case of the Regional Center of Oncology, Agadir. Radiat Med Prot. 2020;1:99-102.
- Set-up uncertainty during breast radiotherapy: image-guided radiotherapy for patients with initial extensive variation. Strahlenther Onkol. 2013;189:315-20.
- Comparison of setup accuracy of three different thermoplastic masks for the treatment of brain and head and neck tumors. Radiother Oncol. 2001;58:155-62.
- Assessment of setup uncertainties for various tumor sites when using daily CBCT for more than 2200 VMAT treatments. J Appl Clin Med Phys. 2014;15:85-99.
- Analysis of setup uncertainties and determination of the variation of the clinical target volume (CTV) to planning target volume (PTV) margin for various tumor sites treated with three-dimensional IGRT couch using KV-CBCT. J Radiat Oncol. 2020;9:25-35.
- Statistical notes for clinical researchers: Assessing normal distribution (1) Restor Dent Endod. 2012;37:245-250.
- An international survey of imaging practices in radiotherapy. Phys Med. 2021;90:53-65.
- A comparison between electronic portal imaging device and cone beam CT in radiotherapy verification of nasopharyngeal carcinoma. Med Dosim. 2011;36:109-12.

