Metrics

The metrics module contains methods that help calculate and visualize the evaluation performance of an algorithm.

Plot Confusion Matrix

metrics.plot_confusion_matrix(y_test: numpy.ndarray, y_pred: numpy.ndarray, labels: List[Union[str, int]], sample_weight: Optional[List[float]] = None, annot_kws=None, cbar=True, cbar_kws=None, **kwargs) → matplotlib.axes._axes.Axes[source]

Computes and plots the confusion matrix, false positive rate, false negative rate, accuracy and F1 score of a classification.

Parameters:
  • y_test – array, shape = [n_samples]. Ground truth (correct) target values.
  • y_pred – array, shape = [n_samples]. Estimated targets as returned by a classifier.
  • labels – array, shape = [n_classes]. List of labels to index the matrix. This may be used to reorder or select a subset of labels.
  • sample_weight

    array-like of shape = [n_samples], optional

    Sample weights.

  • annot_kws

    dict of key, value mappings, optional

    Keyword arguments for ax.text.

  • cbar

    boolean, optional

    Whether to draw a colorbar.

  • cbar_kws

    dict of key, value mappings, optional

Keyword arguments for figure.colorbar.

  • kwargs

    other keyword arguments

    All other keyword arguments are passed to matplotlib.axes.Axes.pcolormesh().

Returns:

Returns the Axes object with the matrix drawn onto it.

Code Examples

In the following examples we are going to use the iris dataset from scikit-learn, so first let’s import it:

import numpy
from sklearn import datasets

IRIS = datasets.load_iris()
RANDOM_STATE = numpy.random.RandomState(0)

Next we’ll add a small helper function that adds noise:

def _add_noisy_features(x, random_state):
    n_samples, n_features = x.shape
    return numpy.c_[x, random_state.randn(n_samples, 200 * n_features)]
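For the iris data this helper expands the feature matrix from 4 columns to 804 (the 4 original features plus 200 × 4 noisy ones); a quick check:

noisy = _add_noisy_features(IRIS.data, numpy.random.RandomState(0))
print(IRIS.data.shape, noisy.shape)  # (150, 4) (150, 804)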

Binary Classification

We’ll use only the first two classes in the iris dataset, build an SVM classifier and evaluate it:

from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from sklearn import svm

from ds_utils.metrics import plot_confusion_matrix


x = IRIS.data
y = IRIS.target

# Add noisy features
x = _add_noisy_features(x, RANDOM_STATE)

# Limit to the two first classes, and split into training and test
x_train, x_test, y_train, y_test = train_test_split(x[y < 2], y[y < 2], test_size=.5,
                                        random_state=RANDOM_STATE)

# Create a simple classifier
classifier = svm.LinearSVC(random_state=RANDOM_STATE)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)

plot_confusion_matrix(y_test, y_pred, [1, 0])

pyplot.show()

And the following image will be shown:

binary classification confusion matrix
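The plot can be customized through the keyword arguments described above; a hedged sketch (the option values are illustrative, assuming annot_kws reaches ax.text, cbar_kws reaches figure.colorbar and any extra keyword arguments reach pcolormesh, as the parameter list states):

ax = plot_confusion_matrix(y_test, y_pred, [1, 0],
                           annot_kws={"fontsize": 12},
                           cbar_kws={"shrink": 0.8},
                           cmap="Blues")
ax.set_title("LinearSVC on the first two iris classes")
pyplot.show()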

Multiclass Classification

This time we’ll train on all three classes and plot an evaluation:

from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn import svm

from ds_utils.metrics import plot_confusion_matrix


x = IRIS.data
y = IRIS.target

# Add noisy features
x = _add_noisy_features(x, RANDOM_STATE)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.5, random_state=RANDOM_STATE)

# Create a simple classifier
classifier = OneVsRestClassifier(svm.LinearSVC(random_state=RANDOM_STATE))
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)

plot_confusion_matrix(y_test, y_pred, [0, 1, 2])
pyplot.show()

And the following image will be shown:

multiclass classification confusion matrix
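Because the method returns the Axes object it drew on, the figure can be saved instead of shown; a minimal sketch using plain matplotlib (the file name is illustrative):

ax = plot_confusion_matrix(y_test, y_pred, [0, 1, 2])
ax.figure.savefig("confusion_matrix_multiclass.png", bbox_inches="tight")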

Plot Metric Growth per Labeled Instances

metrics.plot_metric_growth_per_labeled_instances(X_train: numpy.ndarray, y_train: numpy.ndarray, X_test: numpy.ndarray, y_test: numpy.ndarray, classifiers_dict: Dict[str, sklearn.base.ClassifierMixin], n_samples: Optional[List[int]] = None, quantiles: Optional[List[float]] = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0], metric: Callable[[numpy.ndarray, numpy.ndarray], float] = <function accuracy_score>, random_state: Union[int, numpy.random.mtrand.RandomState, None] = None, n_jobs: Optional[int] = None, verbose: int = 0, pre_dispatch: Union[int, str, None] = '2*n_jobs', *, ax: Optional[matplotlib.axes._axes.Axes] = None, **kwargs) → matplotlib.axes._axes.Axes[source]

Receives train and test sets, and plots the given metric’s change over an increasing number of trained instances.

Parameters:
  • X_train – {array-like or sparse matrix} of shape (n_samples, n_features) The training input samples.
  • y_train – 1d array-like, or label indicator array / sparse matrix The target values (class labels) as integers or strings.
  • X_test – {array-like or sparse matrix} of shape (n_samples, n_features) The test or evaluation input samples.
  • y_test – 1d array-like, or label indicator array / sparse matrix. The true target values (class labels) of the test set.
  • classifiers_dict – mapping from classifier name to classifier object.
  • n_samples – List of numbers of samples for training batches, optional (default=None).
  • quantiles – List of percentages of samples for training batches, optional (default=[0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1]). Used when n_samples=None.
  • metric – sklearn.metrics API function which receives y_true and y_pred and returns a float.
  • random_state

    int, RandomState instance or None, optional (default=None)

    The seed of the pseudo random number generator to use when shuffling the data.

    • If int, random_state is the seed used by the random number generator;
    • If RandomState instance, random_state is the random number generator;
    • If None, the random number generator is the RandomState instance initiated with seed zero.
  • n_jobs

    int or None, optional (default=None)

    Number of jobs to run in parallel.

    • None means 1 unless in a joblib.parallel_backend context.
    • -1 means using all processors.
  • verbose – integer. Controls the verbosity: the higher, the more messages.
  • pre_dispatch

    int, or string, optional

    Controls the number of jobs that get dispatched during parallel execution. Reducing this number can be useful to avoid an explosion of memory consumption when more jobs get dispatched than CPUs can process. This parameter can be:

    • None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs
    • An int, giving the exact number of total jobs that are spawned
    • A string, giving an expression as a function of n_jobs, as in ‘2*n_jobs’
  • ax – Axes object to draw the plot onto, otherwise uses the current Axes.
  • kwargs

    other keyword arguments

    All other keyword arguments are passed to matplotlib.axes.Axes.pcolormesh().

Returns:

Returns the Axes object with the plot drawn onto it.

Code Example

In this example we’ll divide the data into train and test sets, decide which classifiers we want to measure, and plot the results:

from matplotlib import pyplot
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

from ds_utils.metrics import plot_metric_growth_per_labeled_instances


x = IRIS.data
y = IRIS.target

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.3, random_state=0)
plot_metric_growth_per_labeled_instances(x_train, y_train, x_test, y_test,
                                         {"DecisionTreeClassifier":
                                            DecisionTreeClassifier(random_state=0),
                                          "RandomForestClassifier":
                                            RandomForestClassifier(random_state=0, n_estimators=5)})
pyplot.show()

And the following image will be shown:

metric growth per labeled instances plot
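When n_samples is provided it is used instead of quantiles; a hedged sketch using explicit batch sizes, a macro-averaged F1 metric and all available processors (the batch sizes are illustrative; this split leaves 105 training instances):

from functools import partial

from sklearn.metrics import f1_score

plot_metric_growth_per_labeled_instances(x_train, y_train, x_test, y_test,
                                         {"DecisionTreeClassifier":
                                            DecisionTreeClassifier(random_state=0)},
                                         n_samples=[10, 25, 50, 75, 105],
                                         metric=partial(f1_score, average="macro"),
                                         n_jobs=-1)
pyplot.show()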

Visualize Accuracy Grouped by Probability

This method was created due to the lack of maintenance of the EthicalML/xai package.

metrics.visualize_accuracy_grouped_by_probability(y_test: numpy.ndarray, labeled_class: Union[str, int], probabilities: numpy.ndarray, threshold: float = 0.5, display_breakdown: bool = False, bins: Union[int, Sequence[float], pandas.core.indexes.interval.IntervalIndex, None] = None, *, ax: Optional[matplotlib.axes._axes.Axes] = None, **kwargs) → matplotlib.axes._axes.Axes[source]

Receives the true test labels and the classifier’s probability predictions, divides and classifies the results, and finally plots a stacked bar chart of the results.

Parameters:
  • y_test – array, shape = [n_samples]. Ground truth (correct) target values.
  • labeled_class – the class to inquire about.
  • probabilities – array, shape = [n_samples]. Classifier probabilities for the labeled class.
  • threshold – the probability threshold for classifying the labeled class.
  • display_breakdown – if True the results will be broken down into “true positives”, “true negatives”, “false positives” and “false negatives”; otherwise they will be displayed as “correct” and “incorrect”.
  • bins

    int, sequence of scalars, or IntervalIndex

    The criteria to bin by.

    • int : Defines the number of equal-width bins in the range of x. The range of x is extended by .1% on each side to include the minimum and maximum values of x.
    • sequence of scalars : Defines the bin edges allowing for non-uniform width. No extension of the range of x is done.
    • IntervalIndex : Defines the exact bins to be used. Note that IntervalIndex for bins must be non-overlapping.

    default: [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]

  • ax – Axes object to draw the plot onto, otherwise uses the current Axes.
  • kwargs

    other keyword arguments

    All other keyword arguments are passed to matplotlib.axes.Axes.pcolormesh().

Returns:

Returns the Axes object with the plot drawn onto it.

Code Example

The example uses a small sample of a dataset from Kaggle, in which a dummy bank provides loans.

Let’s see how to use the code:

import pandas
from matplotlib import pyplot
from sklearn.ensemble import RandomForestClassifier

from ds_utils.metrics import visualize_accuracy_grouped_by_probability


loan_data = pandas.read_csv("path/to/dataset", encoding="latin1", nrows=11000,
                            parse_dates=["issue_d"]).drop("id", axis=1)
loan_data = loan_data.drop("application_type", axis=1)
loan_data = loan_data.sort_values("issue_d")
loan_data = pandas.get_dummies(loan_data)
train = loan_data.head(int(loan_data.shape[0] * 0.7)).sample(frac=1) \
    .reset_index(drop=True).drop("issue_d", axis=1)
test = loan_data.tail(int(loan_data.shape[0] * 0.3)).drop("issue_d", axis=1)

selected_features = ['emp_length_int', 'home_ownership_MORTGAGE', 'home_ownership_RENT',
                     'income_category_Low', 'term_ 36 months', 'purpose_debt_consolidation',
                     'purpose_small_business', 'interest_payments_High']
classifier = RandomForestClassifier(min_samples_leaf=int(train.shape[0] * 0.01),
                                    class_weight="balanced",
                                    n_estimators=1000, random_state=0)
classifier.fit(train[selected_features], train["loan_condition_cat"])

probabilities = classifier.predict_proba(test[selected_features])
visualize_accuracy_grouped_by_probability(test["loan_condition_cat"], 1, probabilities[:, 1],
                                          display_breakdown=False)

pyplot.show()

And the following image will be shown:

Visualize Accuracy Grouped by Probability

If we choose to display the breakdown:

visualize_accuracy_grouped_by_probability(test["loan_condition_cat"], 1, probabilities[:, 1],
                                          display_breakdown=True)
pyplot.show()

And the following image will be shown:

Visualize Accuracy Grouped by Probability with Breakdown
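The probability bins can also be set explicitly; a short sketch with coarser, quarter-wide bins and a stricter threshold (the values are illustrative):

visualize_accuracy_grouped_by_probability(test["loan_condition_cat"], 1, probabilities[:, 1],
                                          threshold=0.7,
                                          bins=[0, 0.25, 0.5, 0.75, 1],
                                          display_breakdown=True)
pyplot.show()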