Classifiers
Per-column probabilistic classifiers used to predict missingness indicators. Each classifier wraps a scikit-learn estimator and fits a separate model per target column.
| Class | Estimator | Notes |
|---|---|---|
| RFClassifier (default) | Random forest | Auto-tunes max_features and min_samples_leaf |
| ETClassifier | Extra trees | Faster training; more variance |
| LogisticClassifier | Logistic regression | Assumes linear relationships |
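The per-column idea can be sketched directly with scikit-learn. This is a minimal illustration, not the library's implementation: the helper `fit_missingness_models`, the zero fill for missing predictors, and the choice to skip columns with a constant indicator are all assumptions made for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fit_missingness_models(X, random_state=0):
    """Fit one classifier per column that predicts that column's missingness.

    Hypothetical helper for illustration; not part of the documented API.
    """
    mask = np.isnan(X)                   # missingness indicators, shape (n, p)
    X_filled = np.where(mask, 0.0, X)    # assumed placeholder fill so sklearn accepts X
    models = {}
    for j in range(X.shape[1]):
        y = mask[:, j].astype(int)
        if y.min() == y.max():           # indicator is constant: nothing to learn
            continue
        predictors = np.delete(X_filled, j, axis=1)  # all columns except the target
        clf = RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=random_state
        )
        models[j] = clf.fit(predictors, y)
    return models


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[rng.random(200) < 0.3, 0] = np.nan   # column 0: missing independently of the data
X[X[:, 2] > 0.5, 1] = np.nan           # column 1: missingness depends on column 2

models = fit_missingness_models(X)

# Predicted probability that column 1 is missing, for each row.
predictors_1 = np.delete(np.where(np.isnan(X), 0.0, X), 1, axis=1)
proba = models[1].predict_proba(predictors_1)[:, 1]
```

Column 2 is fully observed, so the sketch fits models only for columns 0 and 1.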
RFClassifier
RFClassifier(n_estimators=100, max_features='auto', min_samples_leaf='auto', class_weight='balanced', n_features=None, target_n_jobs=1, n_jobs=None, random_state=None, **kwargs)
Bases: ProbClassifier
sklearn.ensemble.RandomForestClassifier_ wrapper for CI testing.
Uses piecewise max_features heuristics based on n_features.
Set min_samples_leaf='auto' (the default) for adaptive leaf sizing.
.. _sklearn.ensemble.RandomForestClassifier: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
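The signature defaults to class_weight='balanced', which matters because missingness indicators are often imbalanced. As a small sketch of what that setting means in scikit-learn (the toy indicator below is made up for illustration), the weights follow `n_samples / (n_classes * bincount(y))`:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy missingness indicator: 10 of 100 rows are missing (class 1).
y = np.array([0] * 90 + [1] * 10)

# Weights scikit-learn derives for class_weight='balanced'.
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=y
)

# The same quantity computed by hand from the documented formula.
manual = len(y) / (2 * np.bincount(y))

print(dict(zip([0, 1], weights)))
```

Here the rare "missing" class gets weight 5.0 and the common class about 0.56, so each class contributes equally to the fit in aggregate.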
ETClassifier
ETClassifier(n_estimators=100, max_features='auto', min_samples_leaf='auto', class_weight='balanced', n_features=None, target_n_jobs=1, n_jobs=None, random_state=None, **kwargs)
Bases: ProbClassifier
sklearn.ensemble.ExtraTreesClassifier_ wrapper for CI testing.
Uses piecewise max_features heuristics based on n_features.
Set min_samples_leaf='auto' (the default) for adaptive leaf sizing.
.. _sklearn.ensemble.ExtraTreesClassifier: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html
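To make "piecewise max_features heuristics" concrete, here is one shape such a rule could take. The function `auto_max_features` and every threshold in it are hypothetical placeholders chosen for this sketch; the actual rule used by the wrapper is not documented here and may differ.

```python
import math

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier


def auto_max_features(n_features):
    """Hypothetical piecewise rule; thresholds are placeholders, not the library's values."""
    if n_features <= 5:
        return n_features                      # few predictors: consider all of them
    if n_features <= 50:
        return max(1, n_features // 3)         # mid-sized: roughly a third
    return max(1, int(math.sqrt(n_features)))  # wide data: square-root rule


rng = np.random.default_rng(0)
X = rng.normal(size=(150, 12))
y = (X[:, 0] > 0).astype(int)

# Resolve the heuristic to an integer, then hand it to the sklearn estimator.
k = auto_max_features(X.shape[1])
clf = ExtraTreesClassifier(
    n_estimators=100, max_features=k, class_weight="balanced", random_state=0
).fit(X, y)
proba = clf.predict_proba(X)[:, 1]
```

The point of the sketch is the structure: the number of candidate features per split is chosen from the predictor count before the underlying estimator is built.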
LogisticClassifier
LogisticClassifier(penalty='l2', C=1000000.0, solver='lbfgs', max_iter=5000, random_state=None, n_features=None, target_n_jobs=1, **kwargs)
Bases: ProbClassifier
sklearn.linear_model.LogisticRegression_ wrapper for CI testing.
.. _sklearn.linear_model.LogisticRegression: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
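In scikit-learn, C is the inverse regularization strength, so the default C=1000000.0 in the signature above makes the L2 penalty nearly negligible. The sketch below illustrates that effect on synthetic data using plain `LogisticRegression`; the data and the comparison value C=0.01 are assumptions for this example, and the wrapper itself is not used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
# Noisy binary target that depends on the first feature only.
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

# Large C: very weak L2 penalty, close to an unregularized fit.
weak = LogisticRegression(C=1e6, solver="lbfgs", max_iter=5000).fit(X, y)
# Small C: strong L2 penalty, coefficients shrink toward zero.
strong = LogisticRegression(C=0.01, solver="lbfgs", max_iter=5000).fit(X, y)

norm_weak = float(np.linalg.norm(weak.coef_))
norm_strong = float(np.linalg.norm(strong.coef_))
proba = weak.predict_proba(X)[:, 1]

print(f"coefficient norm, C=1e6:  {norm_weak:.3f}")
print(f"coefficient norm, C=0.01: {norm_strong:.3f}")
```

The weakly penalized model keeps larger coefficients, which gives sharper probability estimates when the log-odds really are linear in the predictors.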