Generate Error Analysis Report

The generate_error_analysis_report function provides a tabular error-analysis report that groups predictions by feature values and computes error metrics per group. It’s particularly useful for identifying specific feature ranges or categories where the model underperforms.

ds_utils.metrics.error_analysis.generate_error_analysis_report(X: DataFrame, y_true: ndarray, y_pred: ndarray, feature_columns: List[str] | None = None, bins: int = 10, threshold: float = 0.5, min_count: int = 1, sort_metric: str = 'error_rate', ascending: bool = False) -> DataFrame

Generate a tabular error-analysis report grouped by feature values.

The report groups predictions by feature values and computes error metrics per group. For numerical features, values are binned into equal-width bins. For categorical features, raw values are used as groups.
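The grouping rule described above can be approximated with plain pandas. The following is a minimal sketch for a single feature, not the library's actual implementation: the helper name sketch_group_errors is hypothetical, the numeric check via pd.api.types.is_numeric_dtype is an assumption about how feature types might be told apart, and an error is assumed to mean y_true != y_pred.

```python
import numpy as np
import pandas as pd


def sketch_group_errors(feature: pd.Series, y_true, y_pred, bins: int = 10) -> pd.DataFrame:
    """Hypothetical helper: per-group error metrics for one feature column."""
    # Assumption: a prediction counts as an error when it differs from the true label.
    errors = np.asarray(y_true) != np.asarray(y_pred)

    if pd.api.types.is_numeric_dtype(feature):
        # Numerical feature: pd.cut with an integer splits the value range
        # into that many equal-width bins.
        groups = pd.cut(feature, bins=bins)
    else:
        # Categorical feature: every raw value is its own group.
        groups = feature

    frame = pd.DataFrame({"group": groups.astype(str).to_numpy(), "error": errors})
    report = (
        frame.groupby("group", sort=False)["error"]
        .agg(count="size", error_count="sum")
        .reset_index()
    )
    report["error_rate"] = report["error_count"] / report["count"]
    report["accuracy"] = 1.0 - report["error_rate"]
    report.insert(0, "feature", feature.name)
    return report
```

Calling the helper once per column and concatenating the results with pd.concat would give a table shaped like the report described here.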

Parameters:
  • X – Feature DataFrame.

  • y_true – True labels.

  • y_pred – Predicted labels.

  • feature_columns – List of columns to analyze. If None, all columns in X are used.

  • bins – Number of bins for numerical features.

  • threshold – Reserved for future probability-based error definitions. The value is validated (it must lie in [0, 1]) but is not otherwise used in the current implementation.

  • min_count – Minimum number of samples in a group to be included in the report.

  • sort_metric – Metric to sort the report by. Valid options are: 'feature', 'group', 'count', 'error_count', 'error_rate', 'accuracy'.

  • ascending – Whether to sort in ascending order.

Returns:

DataFrame containing the error analysis report.

Raises:
  • ValueError – If bins < 1, threshold not in [0, 1], min_count < 1, or invalid sort_metric.

  • KeyError – If any column in feature_columns is missing from X.
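The documented argument checks can be pictured as a small validation step run before any grouping. The sketch below is illustrative only: validate_report_args and VALID_SORT_METRICS are hypothetical names, the error messages are invented, and the real function may perform these checks differently or in a different order.

```python
from typing import List, Optional

import pandas as pd

# Sort metrics listed in the parameter description above.
VALID_SORT_METRICS = ("feature", "group", "count", "error_count", "error_rate", "accuracy")


def validate_report_args(
    X: pd.DataFrame,
    feature_columns: Optional[List[str]],
    bins: int,
    threshold: float,
    min_count: int,
    sort_metric: str,
) -> List[str]:
    """Hypothetical validator mirroring the documented ValueError/KeyError cases."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    if sort_metric not in VALID_SORT_METRICS:
        raise ValueError(f"sort_metric must be one of {VALID_SORT_METRICS}")

    # None means "analyze every column in X".
    columns = list(X.columns) if feature_columns is None else list(feature_columns)
    missing = [col for col in columns if col not in X.columns]
    if missing:
        raise KeyError(f"columns not found in X: {missing}")
    return columns
```

Returning the resolved column list keeps the "None means all columns" rule in one place.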

Code Example

import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from ds_utils.metrics.error_analysis import generate_error_analysis_report

# Load dataset and split
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
X["size_category"] = pd.cut(
    X["mean radius"], bins=3, labels=["small", "medium", "large"]
).astype(str)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train a classifier
clf = DecisionTreeClassifier(random_state=42, max_depth=3)
clf.fit(X_train[["mean radius", "mean texture"]], y_train)

y_pred = clf.predict(X_test[["mean radius", "mean texture"]])

# Generate error analysis report for numerical and categorical features
report = generate_error_analysis_report(
    X_test, y_test, y_pred,
    feature_columns=["mean radius", "mean texture", "size_category"],
    bins=3,
    sort_metric="error_rate",
    ascending=False
)
print(report)

The output will be a pandas DataFrame similar to this:

feature        group            count  error_count  error_rate  accuracy
mean radius    (16.71, 24.933]     30            3    0.100000  0.900000
size_category  large               15            1    0.066667  0.933333
mean texture   (25.32, 33.81]      17            1    0.058824  0.941176
mean texture   (16.83, 25.32]      78            4    0.051282  0.948718
mean texture   (8.315, 16.83]      48            2    0.041667  0.958333
size_category  medium              25            1    0.040000  0.960000
mean radius    (8.471, 16.71]     113            4    0.035398  0.964602

(Note: The size_category rows use raw string values as groups, while numerical features are binned. Rows with equal error_rate may appear in any order. Exact values and bins may vary based on data distribution.)
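Because the report is an ordinary DataFrame, it can be narrowed further with standard pandas operations. The sketch below rebuilds the sample rows shown above by hand (count and error_count are copied from the sample output, and the rates are recomputed from them) so that it runs without the library. The support cutoff of 20 samples is an arbitrary illustration, not a recommended value.

```python
import pandas as pd

# Sample rows copied from the example output above.
report = pd.DataFrame(
    {
        "feature": [
            "mean radius", "size_category", "mean texture", "mean texture",
            "mean texture", "size_category", "mean radius",
        ],
        "group": [
            "(16.71, 24.933]", "large", "(25.32, 33.81]", "(16.83, 25.32]",
            "(8.315, 16.83]", "medium", "(8.471, 16.71]",
        ],
        "count": [30, 15, 17, 78, 48, 25, 113],
        "error_count": [3, 1, 1, 4, 2, 1, 4],
    }
)
report["error_rate"] = report["error_count"] / report["count"]
report["accuracy"] = 1.0 - report["error_rate"]

# Keep only groups with enough samples to trust the rate, then take the
# three worst by error rate.
well_supported = report[report["count"] >= 20]
worst = well_supported.nlargest(3, "error_rate")

# Worst-performing group for each feature.
per_feature_worst = report.loc[report.groupby("feature")["error_rate"].idxmax()]

print(worst[["feature", "group", "count", "error_rate"]])
print(per_feature_worst[["feature", "group", "error_rate"]])
```

Filtering on count after the fact has a similar effect to passing a larger min_count, but keeps the small groups available for inspection.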