Usage

There are two interfaces to the library. Anything that actually does something useful, like plotting the grid scores or a learning curve, is available as a top-level function.

There are convenience classes built on top of these functions. These classes take a fitted estimator and the training data (and, optionally, test data), and provide caching of predicted values along with a few other conveniences.
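
For example, a minimal sketch of the two interfaces might look like the following (illustrative only; model is assumed to be a fitted scikit-learn classifier, and X_train, y_train, X_test, y_test its data):

>>> from postlearn.reporter import ClassificationResults, plot_roc_curve
>>> results = ClassificationResults(model, X_train, y_train, X_test, y_test)
>>> results.plot_roc_curve()                     # class interface; predictions are cached
>>> y_score = model.predict_proba(X_test)[:, 1]  # positive-class scores
>>> plot_roc_curve(y_test, y_score)              # equivalent top-level function on raw arrays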

Post-estimation reporting methods.

class postlearn.reporter.ClassificationResults(model, X_train, y_train, X_test=None, y_test=None, labels=None)

A convenience class, wrapping all the reporting methods and caching intermediate calculations.

plot_roc_curve(*, ax=None, y_true=None, y_score=None)

Plot the ROC curve.

proba_test

Predicted probabilities for the test set

proba_train

Predicted probabilities for the training set

y_pred_test

Predicted values for the test set

y_pred_train

Predicted values for the training set

y_score_test

Predicted positive score (column 1) for the test set

y_score_train

Predicted positive score (column 1) for the training set
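
Continuing the sketch above, the cached values are available as plain attributes on the instance; for example:

>>> results.y_pred_test    # predicted labels for the test set
>>> results.proba_test     # predicted class probabilities for the test set
>>> results.y_score_test   # positive-class scores (column 1 of proba_test)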

class postlearn.reporter.GridSearchMixin

Helper methods appropriate for estimators fitted with a GridSearchCV.

postlearn.reporter.confusion_matrix(y_true=None, y_pred=None, labels=None)

DataFrame of the confusion matrix. Rows are actual values; columns are predicted values.

Parameters:

y_true : array

y_pred : array

labels : list-like

Returns:

confusion_matrix : DataFrame
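
An illustrative call, assuming y_test holds the true labels and y_pred the predicted labels:

>>> cm = confusion_matrix(y_true=y_test, y_pred=y_pred, labels=[0, 1])

The result is a DataFrame indexed by the actual labels, with the predicted labels as columns.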

postlearn.reporter.default_args(**attrs)

Pull the defaults for a method from self.

Parameters:

attrs : dict

mapping parameter name to attribute name. Attributes with the same name need not be included.

Returns:

deco: new function, injecting the attrs into kwargs

Notes

Only usable with keyword-only arguments.

Examples

@default_args(y='y_train')
def printer(self, *, y=None, y_pred=None):
    print('y: ', y)
    print('y_pred: ', y_pred)

postlearn.reporter.extract_grid_scores(model)

Extract grid scores from a model or pipeline.

Parameters:

model : Estimator or Pipeline

must end in sklearn.grid_search.GridSearchCV

Returns:

scores : list
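
A sketch of pulling scores out of a pipeline whose last step is a GridSearchCV (the estimator and grid here are illustrative, and X, y are assumed training data):

>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.grid_search import GridSearchCV
>>> grid = GridSearchCV(LogisticRegression(), param_grid={'C': [0.1, 1, 10]})
>>> pipe = make_pipeline(StandardScaler(), grid).fit(X, y)
>>> scores = extract_grid_scores(pipe)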

postlearn.reporter.plot_feature_importance(model, labels, n=10, orient='h')

Bar plot of feature importance.

Parameters:

model : Pipeline or Estimator

labels : list-like

n : int

number of features to include

orient : {'h', 'v'}

horizontal or vertical barplot

Returns:

ax : matplotlib.axes

Notes

Works with regression models exposing coef_ or ensembles exposing feature_importances_.
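
An illustrative call, where rf is a fitted RandomForestClassifier and feature_names are the column names of the training data:

>>> ax = plot_feature_importance(rf, labels=feature_names, n=10, orient='h')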

postlearn.reporter.plot_grid_scores(model, x, y=None, hue=None, row=None, col=None, col_wrap=None, **kwargs)

Wrapper around seaborn.factorplot.

Parameters:

model : Pipeline or Estimator

x, hue, row, col : str

parameters grid searched over

y : str

the target of interest, default 'mean_'

Returns:

g : seaborn.FacetGrid
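
For instance, with a grid search over n_estimators and max_features like the one in the unpack_grid_scores example below, the mean score can be plotted against one parameter and colored by the other:

>>> g = plot_grid_scores(model, x='n_estimators', hue='max_features')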

postlearn.reporter.plot_learning_curve(estimator, X, y, train_sizes=array([ 0.1, 0.325, 0.55, 0.775, 1. ]), cv=None, n_jobs=1, ax=None)

Plot the learning curve for estimator.

Parameters:

estimator : sklearn.Estimator

X : array-like

y : array-like

train_sizes : array-like

list of floats between 0 and 1

cv : int

n_jobs : int

ax : matplotlib.axes
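
An illustrative call (X and y are assumed to be array-like features and targets):

>>> import numpy as np
>>> from sklearn.linear_model import LogisticRegression
>>> plot_learning_curve(LogisticRegression(), X, y,
...                     train_sizes=np.linspace(0.1, 1.0, 5), cv=5)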

postlearn.reporter.plot_regularization_path(model)

Plot the regularization path of coefficients from, e.g., a Lasso.
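
A guess at typical usage, based only on the description above (a fitted Lasso and its training data X, y are assumed):

>>> from sklearn.linear_model import Lasso
>>> lasso = Lasso(alpha=0.1).fit(X, y)
>>> plot_regularization_path(lasso)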

postlearn.reporter.plot_roc_curve(y_true, y_score, ax=None)

Plot the Receiver Operating Characteristic (ROC) curve, including the Area Under the Curve (AUC) score.

Parameters:

y_true : array

y_score : array

ax : matplotlib.axes, defaults to new axes

Returns:

ax : matplotlib.axes
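
Illustrative usage, taking y_score as the positive-class column of predict_proba (matching the y_score_* attributes above):

>>> y_score = model.predict_proba(X_test)[:, 1]
>>> ax = plot_roc_curve(y_test, y_score)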

postlearn.reporter.unpack_grid_scores(model=None)

Unpack mean grid scores into a DataFrame

Parameters:

model : Estimator or Pipeline

must end in sklearn.grid_search.GridSearchCV

Returns:

scores : DataFrame

See also

plot_grid_scores

Examples

>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn import datasets
>>> from sklearn.grid_search import GridSearchCV
>>> from sklearn.preprocessing import StandardScaler
>>> X, y = datasets.make_classification()
>>> model = GridSearchCV(RandomForestClassifier(),
...                      param_grid={
...                          'n_estimators': [10, 20, 30],
...                          'max_features': [.1, .5, 1]
...                      })
>>> model.fit(X, y)
>>> unpack_grid_scores(model)
   mean_      std_  max_features  n_estimators
0   0.88  0.062416           0.1            10
1   0.88  0.046536           0.1            20
2   0.85  0.095309           0.1            30
3   0.88  0.062686           0.5            10
4   0.91  0.072044           0.5            20
5   0.90  0.073366           0.5            30
6   0.78  0.032929           1.0            10
7   0.86  0.048224           1.0            20
8   0.85  0.072174           1.0            30