CIMissTest

The main test class. Compares classifier loss for predicting missingness with and without the outcome variable, combining results across multiple imputations and cross-validation folds.

CIMissTest(dataset, imputer=MidasImputer, classifier=RFClassifier, m=10, n_folds=10, classifier_args={}, imputer_args={}, random_state=42, target_level='variable', variance_method='mi_crossfit', subsample_cap=2000)

Conditional-independence-of-missingness test.

Compares classifier loss for predicting the missingness indicator with and without the outcome variable. A significant difference implies missingness is not conditionally independent of the outcome.
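The comparison described above can be illustrated with a minimal, standard-library-only sketch. It is not the package's implementation: it skips imputation entirely, swaps in a smoothed frequency-table classifier for RFClassifier, and uses simulated data. All names in it (bin_index, cv_log_loss, simulate, loss_difference) are hypothetical.

```python
import math
import random


def bin_index(value, n_bins=4):
    """Map a value in [0, 1) to one of n_bins equal-width bins."""
    return min(int(value * n_bins), n_bins - 1)


def cv_log_loss(features, labels, n_folds=5):
    """Cross-validated log loss of a Laplace-smoothed frequency-table classifier."""
    n = len(labels)
    total = 0.0
    for fold in range(n_folds):
        counts = {}  # feature cell -> [rows seen, rows with label 1]
        for i in range(n):
            if i % n_folds != fold:  # training rows for this fold
                cell = counts.setdefault(features[i], [0, 0])
                cell[0] += 1
                cell[1] += labels[i]
        for i in range(n):
            if i % n_folds == fold:  # held-out rows for this fold
                seen, ones = counts.get(features[i], (0, 0))
                p = (ones + 1) / (seen + 2)  # smoothed P(missing = 1 | cell)
                total -= math.log(p if labels[i] == 1 else 1 - p)
    return total / n


def simulate(n, depends_on_outcome, seed=0):
    """Simulate (covariate x, outcome y, missingness indicator r) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        y = 0.5 * x + 0.5 * rng.random()
        driver = y if depends_on_outcome else x
        p_missing = 0.9 if driver > 0.5 else 0.1
        r = 1 if rng.random() < p_missing else 0
        rows.append((x, y, r))
    return rows


def loss_difference(rows):
    """Loss without the outcome minus loss with the outcome."""
    labels = [r for _, _, r in rows]
    without_y = [(bin_index(x),) for x, _, _ in rows]
    with_y = [(bin_index(x), bin_index(y)) for x, y, _ in rows]
    return cv_log_loss(without_y, labels) - cv_log_loss(with_y, labels)


diff_dependent = loss_difference(simulate(2000, depends_on_outcome=True))
diff_independent = loss_difference(simulate(2000, depends_on_outcome=False))
print(f"missingness driven by outcome:   {diff_dependent:.3f}")
print(f"missingness driven by covariate: {diff_independent:.3f}")
```

When missingness is driven by the outcome, adding the outcome to the classifier lowers the held-out loss noticeably. When it is driven only by the covariate, the two losses stay close. The sketch stops at the raw difference and does not compute a significance level.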

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset | Dataset | Populated Dataset (call Dataset.make first). | required |
| imputer | type[Imputer] | Imputer class to use (default MidasImputer). | MidasImputer |
| classifier | type[CIClassifier] | Classifier class to use (default RFClassifier). | RFClassifier |
| m | int | Number of multiply-imputed datasets. | 10 |
| n_folds | int | Number of cross-validation folds. | 10 |
| classifier_args | dict | Extra keyword arguments forwarded to the classifier. | {} |
| imputer_args | dict | Extra keyword arguments forwarded to the imputer. | {} |
| variance_method | str | 'mi_crossfit' (default) or 'legacy_fold'. | 'mi_crossfit' |
| subsample_cap | int or None | Maximum number of rows to subsample for testing. Set to None to disable subsampling (default 2000). | 2000 |
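As a rough illustration of the subsample_cap semantics, the sketch below caps the number of rows and treats None as "use every row". The helper name subsample_indices is hypothetical. Uniform sampling without replacement and seeding from a random_state-style integer are assumptions, not documented behaviour of CIMissTest.

```python
import random


def subsample_indices(n_rows, cap=2000, seed=42):
    """Return the row indices to keep under a subsample cap (hypothetical helper)."""
    # cap=None disables subsampling; data already under the cap is left untouched.
    if cap is None or n_rows <= cap:
        return list(range(n_rows))
    # Assumed behaviour: uniform sampling without replacement, reproducible via seed.
    return sorted(random.Random(seed).sample(range(n_rows), cap))


print(len(subsample_indices(5000)))            # capped at the default of 2000 rows
print(len(subsample_indices(5000, cap=None)))  # cap disabled, all rows kept
```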

run()

Run the conditional independence test.

Call this method after constructing the test object to execute the test.

summary()

Print a summary of the test results.