Comparison of Chi-square based algorithms for discretization of continuous chicken egg quality traits
Discretization is a data pre-processing task that transforms continuous variables into discrete ones. In this study, four Chi-square based supervised discretization algorithms (ChiMerge, Chi2, Extended Chi2 and Modified Chi2) were compared for discretizing the fourteen continuous variables in a chicken egg quality traits dataset. We found that all of the algorithms performed similarly in terms of the training model accuracies obtained with the C5.0 classification tree algorithm, whereas ChiMerge and Chi2 were better than the remaining algorithms in terms of training error rates. The numbers of intervals obtained with Chi2 tended to be large, while they were very small with Extended Chi2 and Modified Chi2. The number of intervals from ChiMerge increased as the significance level increased, whereas it was the same at all significance levels for the remaining algorithms. Consequently, ChiMerge at the significance levels of 0.05 and 0.10 was more efficient than the others and could be a better choice for discretizing the egg quality traits.
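To make the merging idea behind ChiMerge concrete, the following is a minimal Python sketch of bottom-up Chi-square merging for a single continuous variable. It is an illustration under stated assumptions, not the implementation used in the study: the function names `chi2_pair` and `chimerge`, the toy data, and the choice to pass the Chi-square critical value directly as `threshold` are all hypothetical. The value 2.706 used below is the standard critical value for a significance level of 0.10 with one degree of freedom, which assumes a two-class target (3.841 would correspond to 0.05).

```python
from collections import Counter


def chi2_pair(a, b):
    """Chi-square statistic for the class counts of two adjacent intervals."""
    classes = set(a) | set(b)  # only classes that occur in at least one interval
    n = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            expected = row_total * (a[c] + b[c]) / n
            stat += (row[c] - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold):
    """Return the lower bounds of the intervals left after merging.

    Assumption: `threshold` is the Chi-square critical value for the chosen
    significance level and (number of classes - 1) degrees of freedom.
    """
    # Start with one interval per distinct value, holding its class counts.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    bounds = sorted(counts)
    freqs = [counts[v] for v in bounds]

    while len(bounds) > 1:
        stats = [chi2_pair(freqs[i], freqs[i + 1]) for i in range(len(freqs) - 1)]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] >= threshold:
            break  # every adjacent pair differs significantly: stop merging
        # Merge the most similar adjacent pair into one interval.
        freqs[i] = freqs[i] + freqs[i + 1]
        del freqs[i + 1], bounds[i + 1]
    return bounds


# Toy data (hypothetical): the class changes between the values 3 and 4.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "b"]
cut_points = chimerge(values, labels, threshold=2.706)
print(cut_points)
```

In this sketch a higher significance level corresponds to a lower critical value, so merging stops earlier and more intervals remain, which matches the direction of the ChiMerge behaviour reported above.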