XYOnion generates representative calibration and validation subsets by combining distances in the predictor (X) and response (y) spaces and assigning samples to layered shells based on this combined metric. The layered structure gives balanced coverage of the entire data space and prevents extrapolation in the validation subset, which leads to more robust model assessment. Compared with other splitting algorithms such as Kennard-Stone or SPXY, XYOnion produces more realistic and stable figures of merit and avoids the overly optimistic performance estimates that arise from unbalanced or non-representative splits.
Dr. Jokin Ezenarro [1]
[1] Department of Food Science, University of Copenhagen, Frederiksberg, Denmark
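
The shell-based assignment described in the summary above can be illustrated with a minimal sketch. This is not the XYOnion implementation: the function names `combined_distance` and `xy_onion_sketch`, the parameters `n_shells` and `val_fraction`, the use of centroid distances on autoscaled data, and the rule of sending the outermost shell entirely to calibration are all assumptions made for illustration. The sketch only shows one plausible way to combine X and y distances, layer the samples, and keep the validation subset inside the range spanned by the calibration subset.

```python
import numpy as np


def combined_distance(X, y):
    """Sum of max-normalized centroid distances in X and y space (assumed metric)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)

    def centroid_dist(A):
        std = A.std(axis=0)
        std[std == 0] = 1.0                    # guard against constant columns
        Z = (A - A.mean(axis=0)) / std         # autoscale each variable
        d = np.linalg.norm(Z, axis=1)          # Euclidean distance to the centroid
        m = d.max()
        return d / m if m > 0 else d           # scale so X and y weigh equally

    return centroid_dist(X) + centroid_dist(y)


def xy_onion_sketch(X, y, n_shells=6, val_fraction=0.3, seed=0):
    """Hypothetical layered split: returns sorted (calibration, validation) indices."""
    d = combined_distance(X, y)
    order = np.argsort(-d)                     # outermost samples first
    shells = np.array_split(order, n_shells)   # consecutive layers of similar size
    rng = np.random.default_rng(seed)

    cal = list(shells[0])                      # outermost shell: calibration only
    val = []
    for shell in shells[1:]:
        shell = rng.permutation(shell)         # random assignment within a shell
        n_val = int(round(val_fraction * len(shell)))
        val.extend(shell[:n_val])
        cal.extend(shell[n_val:])

    return np.sort(np.array(cal, dtype=int)), np.sort(np.array(val, dtype=int))
```

Because the outermost shell is reserved for calibration in this sketch, no validation sample lies farther from the centroid (in the combined metric) than the most extreme calibration sample, which is the sense in which extrapolation in the validation subset is avoided here. Sampling validation points from every inner shell is what spreads both subsets across the data space.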