Classes | |
class | CDependenceMaximization |
Class CDependenceMaximization is the base class for all feature selection preprocessors that select the subset of features showing maximum dependence between the features and the labels. Dependence is measured by m_estimator, an implementation of CIndependenceTest, inside compute_measures() (see the class documentation of CFeatureSelection). For a given feature \(\mathbf{X}_i\) from the set of features \(\mathbf{X}\) and the labels \(\mathbf{Y}\), it performs a statistical test of the null hypothesis \[ \textbf{H}_0 : P\left(\mathbf{X}\setminus \mathbf{X}_i, \mathbf{Y}\right) =P\left(\mathbf{X}\setminus \mathbf{X}_i\right)P\left(\mathbf{Y}\right) \] The test statistic is then used as the measure. The lowest measure signifies the lowest dependence, so the lowest-scoring features are removed. The removal policy can therefore only be N_SMALLEST or PERCENTILE_SMALLEST, and it is set via set_policy(). The remove_feats() method handles the removal of features according to the specified policy. | |
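
The two removal policies named above can be illustrated with a small standalone sketch. This is not Shogun's implementation: `RemovalPolicy`, `features_to_remove`, and the `amount` parameter are hypothetical names introduced only for illustration, and truncating the percentile count toward zero is an assumption rather than documented behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for the N_SMALLEST and PERCENTILE_SMALLEST policies;
// this is not Shogun's own enum.
enum class RemovalPolicy { NSmallest, PercentileSmallest };

// Returns the indices of the features to drop, ordered from lowest measure up.
// `amount` is a feature count for NSmallest and a percentage in [0, 100]
// for PercentileSmallest (assumption: the resulting count is truncated).
std::vector<std::size_t> features_to_remove(const std::vector<double>& measures,
                                            RemovalPolicy policy, double amount)
{
    assert(amount >= 0.0);
    std::size_t count = 0;
    if (policy == RemovalPolicy::NSmallest)
    {
        count = static_cast<std::size_t>(amount);
    }
    else
    {
        assert(amount <= 100.0);
        count = static_cast<std::size_t>(measures.size() * amount / 100.0);
    }
    // Never ask for more features than exist.
    count = std::min(count, measures.size());

    // Sort feature indices by ascending measure; ties keep their original order.
    std::vector<std::size_t> order(measures.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return measures[a] < measures[b]; });

    // Keep only the `count` lowest-scoring indices.
    order.resize(count);
    return order;
}
```

With measures `{0.9, 0.1, 0.5, 0.3}`, asking for the two smallest selects features 1 and 3 for removal, since they carry the lowest test statistics.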