# chi2: Discretization using the Chi2 algorithm

## Description

This function performs discretization with the Chi2 algorithm. The Chi2 algorithm automatically determines a proper chi-square ($\chi^2$) threshold that preserves the fidelity of the original numeric dataset.

## Usage

```r
chi2(data, alp = 0.5, del = 0.05)
```

## Arguments

data

the dataset to be discretized

alp

significance level; $\alpha$

del

upper bound $\delta$ on the inconsistency rate; the discretized data must satisfy $Inconsistency(data) < \delta$ (Liu and Setiono, 1995)
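To make the role of `del` concrete, the following is a minimal sketch of the inconsistency rate as defined by Liu and Setiono (1995): for every distinct pattern of attribute values, count the matching instances that do not belong to the pattern's majority class, then divide the total by the number of instances. The helper `inconsistency_rate()` and the `toy` data are hypothetical names introduced for illustration; this is not the package's `incon()`, and it assumes a data frame whose last column holds the class label.

```r
# Hypothetical helper (not the package's incon()): inconsistency rate of a
# discretized data frame whose last column is assumed to hold the class label.
inconsistency_rate <- function(disc) {
  p <- ncol(disc)
  # One key per distinct combination of attribute values
  key <- do.call(paste, c(disc[, -p, drop = FALSE], sep = "|"))
  # Rows: attribute patterns; columns: class labels
  counts <- table(key, disc[[p]])
  # Per pattern: matching instances minus those in the majority class
  sum(rowSums(counts) - apply(counts, 1, max)) / nrow(disc)
}

toy <- data.frame(
  x1 = c(1, 1, 1, 2, 2),
  x2 = c(1, 1, 1, 1, 1),
  class = c("a", "a", "b", "b", "b")
)
# Pattern (1, 1) holds two "a" and one "b", so one of five rows is inconsistent
inconsistency_rate(toy)  # 0.2
```

With `del = 0.05`, this toy data would already exceed the allowed inconsistency, so no further merging would be accepted.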

## Value

- `cutp`: list of cut-points for each variable
- `Disc.data`: discretized data matrix

## Details

The Chi2 algorithm is based on the $\chi^2$ statistic and consists of two phases. The first phase begins with a high significance level (sigLevel) for all numeric attributes to be discretized. Each attribute is sorted according to its values, and then the following steps are performed:

**step 1.** calculate the $\chi^2$ value for every pair of adjacent intervals (at the beginning, each pattern is put into its own interval that contains only one value of an attribute);

**step 2.** merge the pair of adjacent intervals with the lowest $\chi^2$ value. Merging continues until all pairs of intervals have $\chi^2$ values exceeding the parameter determined by sigLevel.

The above process is repeated with a decreased sigLevel until the *inconsistency rate* threshold $\delta$ (see `incon()`) is exceeded in the discretized data (Liu and Setiono, 1995).

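The merging steps described above can be sketched in base R for a single attribute at one fixed significance level. For two adjacent intervals and $k$ classes the statistic is $\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{k} (A_{ij} - E_{ij})^2 / E_{ij}$ with $E_{ij} = R_i C_j / N$, where $A_{ij}$ is the number of patterns of class $j$ in interval $i$, $R_i$ and $C_j$ are the row and column totals, and $N$ is their sum. The function names, the replacement of zero expected counts by 0.1, and the reporting of cut-points as lower interval bounds are assumptions of this sketch, not the package's implementation.

```r
# Chi-square statistic for two adjacent intervals.
# `a` and `b` are class-count vectors with one entry per class label.
adjacent_chisq <- function(a, b) {
  obs <- rbind(a, b)
  expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  expected[expected == 0] <- 0.1  # assumed guard against division by zero
  sum((obs - expected)^2 / expected)
}

# One merging pass for a single attribute `x` with class labels `y`.
merge_intervals <- function(x, y, alpha) {
  y <- factor(y)
  counts <- unclass(table(x, y))  # one row per distinct value, in sorted order
  lower <- sort(unique(x))        # lower bound of each current interval
  threshold <- qchisq(1 - alpha, df = nlevels(y) - 1)
  while (nrow(counts) > 1) {
    # chi-square value of every pair of adjacent intervals
    stats <- vapply(
      seq_len(nrow(counts) - 1),
      function(i) adjacent_chisq(counts[i, ], counts[i + 1, ]),
      numeric(1)
    )
    # stop once every pair exceeds the threshold implied by alpha
    if (min(stats) > threshold) break
    # merge the pair with the lowest chi-square value
    i <- which.min(stats)
    counts[i, ] <- counts[i, ] + counts[i + 1, ]
    counts <- counts[-(i + 1), , drop = FALSE]
    lower <- lower[-(i + 1)]
  }
  lower[-1]  # cut-points: lower bounds of all intervals except the first
}

merge_intervals(1:6, c("a", "a", "a", "b", "b", "b"), alpha = 0.5)  # 4
```

In the full algorithm this pass would be repeated over all attributes with a decreasing significance level, and the inconsistency rate would be checked after each round.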
## References

Liu, H. and Setiono, R. (1995). Chi2: Feature selection and discretization of numeric attributes, *Tools with Artificial Intelligence*, 388--391.

Liu, H. and Setiono, R. (1997). Feature selection and discretization, *IEEE Transactions on Knowledge and Data Engineering*, **Vol.9, no.4**, 642--645.

## Examples

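```r
library(discretization)  # package that provides chi2()
data(iris)
# run the discretization once and reuse the result
disc <- chi2(iris, alp = 0.5, del = 0.05)
# cut-points for each variable
disc$cutp
# discretized dataset produced by the Chi2 algorithm
disc$Disc.data
```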
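
As a rough illustration of how a vector of cut-points relates to discretized values, the base R function `findInterval()` can map numeric values to interval codes. The cut-points and values below are hypothetical, and the coding convention (codes starting at 1, intervals closed on the left) is an assumption of this sketch that may differ from the coding used in `Disc.data`.

```r
# Hypothetical cut-points for one variable; real values come from chi2(...)$cutp
cuts <- c(5.5, 6.2)
values <- c(4.9, 5.5, 6.0, 7.1)
# findInterval() counts how many cut-points are <= each value, so adding 1
# yields interval codes starting at 1 (intervals closed on the left).
codes <- findInterval(values, cuts) + 1
codes  # 1 2 2 3
```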