CIMissTest
The main test class. Compares classifier loss for predicting missingness with and without the outcome variable, combining results across multiple imputations and cross-validation folds.
CIMissTest(dataset, imputer=MidasImputer, classifier=RFClassifier, m=10, n_folds=10, classifier_args={}, imputer_args={}, random_state=42, target_level='variable', variance_method='mi_crossfit', subsample_cap=2000)
Conditional-independence-of-missingness test.
Compares classifier loss for predicting the missingness indicator with and without the outcome variable. A significantly lower loss when the outcome is included is evidence that missingness is not conditionally independent of the outcome.
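The comparison described above can be illustrated with a minimal, dependency-free sketch. It is not the `CIMissTest` implementation: a plain logistic regression stands in for the default random-forest classifier, the data are simulated, there is no imputation step, and every name in the block (`fit_logistic`, `cv_loss`, `loss_drop`, and so on) is hypothetical. The sketch only shows the core idea of comparing held-out loss for predicting a missingness indicator with and without the outcome.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; bias is the last weight."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[-1] + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[-1] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def log_loss(w, X, y):
    """Mean binary cross-entropy of the fitted model on (X, y)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(w[-1] + sum(wj * xj for wj, xj in zip(w, xi)))
        p = min(max(p, eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def cv_loss(X, y, n_folds=5):
    """Mean held-out log loss over contiguous folds (rows are already in random order)."""
    n = len(y)
    losses = []
    for k in range(n_folds):
        lo, hi = k * n // n_folds, (k + 1) * n // n_folds
        train = [i for i in range(n) if i < lo or i >= hi]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        losses.append(log_loss(w, X[lo:hi], y[lo:hi]))
    return sum(losses) / n_folds


# Simulated data: the missingness indicator r depends on the outcome y.
rng = random.Random(0)
n = 400
x = [rng.gauss(0, 1) for _ in range(n)]   # fully observed covariate
y = [xi + rng.gauss(0, 1) for xi in x]    # outcome variable
r = [1 if rng.random() < sigmoid(1.5 * yi) else 0 for yi in y]

loss_without = cv_loss([[xi] for xi in x], r)               # covariate only
loss_with = cv_loss([[xi, yi] for xi, yi in zip(x, y)], r)  # covariate + outcome
loss_drop = loss_without - loss_with

print(f"loss without outcome: {loss_without:.3f}")
print(f"loss with outcome:    {loss_with:.3f}")
print(f"drop in loss:         {loss_drop:.3f}")
```

Because missingness here is generated from the outcome, including the outcome lowers the held-out loss. A formal test would also need a variance estimate for that difference, which this sketch leaves out.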
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `Dataset` | Populated … | required |
| `imputer` | `type[Imputer]` | Imputer class to use (default `MidasImputer`). | `MidasImputer` |
| `classifier` | `type[CIClassifier]` | Classifier class to use (default `RFClassifier`). | `RFClassifier` |
| `m` | `int` | Number of multiply-imputed datasets. | `10` |
| `n_folds` | `int` | Number of cross-validation folds. | `10` |
| `classifier_args` | `dict` | Extra keyword arguments forwarded to the classifier. | `{}` |
| `imputer_args` | `dict` | Extra keyword arguments forwarded to the imputer. | `{}` |
| `variance_method` | `str` | | `'mi_crossfit'` |
| `subsample_cap` | `int` or `None` | Maximum number of rows to subsample for testing. Set to … | `2000` |
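With `m` imputed datasets, the per-imputation results have to be pooled into one estimate and one variance. This page does not say what `variance_method='mi_crossfit'` computes, so the sketch below shows Rubin's rules instead, a standard way to pool multiply-imputed estimates; treating it as comparable is an assumption. The function name `pool_rubin` and the input numbers are hypothetical and chosen only for illustration.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)       # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between     # total variance of the pooled estimate
    return pooled, total


# Hypothetical loss differences and their variances from m = 5 imputed datasets.
deltas = [0.12, 0.10, 0.15, 0.11, 0.12]
delta_vars = [0.0016, 0.0018, 0.0015, 0.0017, 0.0014]

estimate, total_var = pool_rubin(deltas, delta_vars)
z_stat = estimate / math.sqrt(total_var)
print(f"pooled estimate = {estimate:.4f}, total variance = {total_var:.5f}, z = {z_stat:.2f}")
```

The between-imputation term inflates the variance to reflect uncertainty from the imputation itself, which is why a single imputed dataset tends to give overconfident results.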