Generate Error Analysis Report
The generate_error_analysis_report function builds a tabular report that groups predictions by feature values and computes error metrics for each group. It is particularly useful for finding the feature ranges or categories where a model underperforms.
- ds_utils.metrics.error_analysis.generate_error_analysis_report(X: DataFrame, y_true: ndarray, y_pred: ndarray, feature_columns: List[str] | None = None, bins: int = 10, threshold: float = 0.5, min_count: int = 1, sort_metric: str = 'error_rate', ascending: bool = False) -> DataFrame
Generate a tabular error-analysis report grouped by feature values.
The report groups predictions by feature values and computes error metrics per group. For numerical features, values are binned into equal-width bins. For categorical features, raw values are used as groups.
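The grouping described above can be illustrated with a minimal sketch. It is not the library's implementation: the helper name sketch_group_errors is hypothetical, and it assumes that equal-width binning behaves like pandas.cut and that an error is any sample where the predicted label differs from the true label.

```python
import numpy as np
import pandas as pd


def sketch_group_errors(feature: pd.Series, y_true, y_pred, bins: int = 10) -> pd.DataFrame:
    """Hypothetical helper: per-group error metrics for a single feature."""
    # 1 where the prediction is wrong, 0 where it is right.
    errors = pd.Series(
        (np.asarray(y_true) != np.asarray(y_pred)).astype(int),
        index=feature.index,
    )
    # Numerical features: equal-width bins (assumed to match pd.cut).
    # Categorical features: the raw values act as groups.
    if pd.api.types.is_numeric_dtype(feature):
        groups = pd.cut(feature, bins=bins)
    else:
        groups = feature
    frame = pd.DataFrame({"group": groups, "error": errors})
    summary = frame.groupby("group", observed=True)["error"].agg(["size", "sum"])
    summary = summary.rename(columns={"size": "count", "sum": "error_count"}).reset_index()
    summary["error_rate"] = summary["error_count"] / summary["count"]
    summary["accuracy"] = 1.0 - summary["error_rate"]
    summary.insert(0, "feature", feature.name)
    return summary


# Tiny usage example with made-up values.
numeric_report = sketch_group_errors(
    pd.Series([0.0, 1.0, 2.0, 3.0], name="x"), [0, 0, 1, 1], [0, 1, 1, 1], bins=2
)
categorical_report = sketch_group_errors(
    pd.Series(["a", "b", "a"], name="c"), [1, 1, 0], [1, 0, 1]
)
```

Concatenating such per-feature frames, dropping groups below min_count, and sorting by the chosen metric would give a report with the same columns as the one documented here.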
- Parameters:
X – Feature DataFrame.
y_true – True labels.
y_pred – Predicted labels.
feature_columns – List of columns to analyze. If None, all columns in X are used.
bins – Number of bins for numerical features.
threshold – Reserved for future probability-based error definitions. The value is validated (it must lie in [0, 1]) but is not otherwise used in the current implementation.
min_count – Minimum number of samples in a group to be included in the report.
sort_metric – Metric to sort the report by. Valid options are: 'feature', 'group', 'count', 'error_count', 'error_rate', 'accuracy'.
ascending – Whether to sort in ascending order.
- Returns:
DataFrame containing the error analysis report.
- Raises:
ValueError – If bins < 1, threshold is not in [0, 1], min_count < 1, or sort_metric is not one of the valid options.
KeyError – If any column in feature_columns is missing from X.
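The documented error conditions can be restated as a small stand-alone check. This is a sketch only: the name check_report_args, the order of the checks, and the message texts are assumptions, not the library's code.

```python
from typing import Iterable, Optional, Sequence

# Sort keys listed in the sort_metric parameter description.
VALID_SORT_METRICS = ("feature", "group", "count", "error_count", "error_rate", "accuracy")


def check_report_args(
    available_columns: Iterable[str],
    feature_columns: Optional[Sequence[str]] = None,
    bins: int = 10,
    threshold: float = 0.5,
    min_count: int = 1,
    sort_metric: str = "error_rate",
) -> None:
    """Hypothetical validator mirroring the documented Raises section."""
    if bins < 1:
        raise ValueError("bins must be >= 1")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    if min_count < 1:
        raise ValueError("min_count must be >= 1")
    if sort_metric not in VALID_SORT_METRICS:
        raise ValueError(f"sort_metric must be one of {VALID_SORT_METRICS}")
    if feature_columns is not None:
        # Any requested column that X does not contain is a KeyError.
        missing = sorted(set(feature_columns) - set(available_columns))
        if missing:
            raise KeyError(f"columns missing from X: {missing}")
```

Running the same checks on your own arguments before calling the function can make a failure easier to trace, since each condition maps to exactly one documented exception.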
Code Example
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from ds_utils.metrics.error_analysis import generate_error_analysis_report
# Load dataset and split
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
X["size_category"] = pd.cut(
    X["mean radius"], bins=3, labels=["small", "medium", "large"]
).astype(str)
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# Train a classifier
clf = DecisionTreeClassifier(random_state=42, max_depth=3)
clf.fit(X_train[["mean radius", "mean texture"]], y_train)
y_pred = clf.predict(X_test[["mean radius", "mean texture"]])
# Generate error analysis report for numerical and categorical features
report = generate_error_analysis_report(
    X_test, y_test, y_pred,
    feature_columns=["mean radius", "mean texture", "size_category"],
    bins=3,
    sort_metric="error_rate",
    ascending=False
)
print(report)
The output will be a pandas DataFrame similar to this:
| feature | group | count | error_count | error_rate | accuracy |
|---|---|---|---|---|---|
| mean radius | (16.71, 24.933] | 30 | 3 | 0.100000 | 0.900000 |
| size_category | large | 15 | 1 | 0.066667 | 0.933333 |
| mean texture | (25.32, 33.81] | 17 | 1 | 0.058824 | 0.941176 |
| mean texture | (16.83, 25.32] | 78 | 4 | 0.051282 | 0.948718 |
| mean texture | (8.315, 16.83] | 48 | 2 | 0.041667 | 0.958333 |
| size_category | medium | 25 | 1 | 0.040000 | 0.960000 |
| mean radius | (8.471, 16.71] | 113 | 4 | 0.035398 | 0.964602 |
(Note: The size_category rows use raw string values as groups, while numerical features are binned. Rows with equal error_rate may appear in any order. Exact values and bins may vary based on data distribution.)
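Because the report is an ordinary DataFrame, it can be filtered further with plain pandas. The sketch below rebuilds the illustrative rows shown above by hand (so it runs without the library) and keeps the worst group per feature among groups with enough samples. The cutoff of 20 samples is an arbitrary assumption chosen for illustration.

```python
import pandas as pd

# Illustrative rows copied from the example output above.
report = pd.DataFrame(
    {
        "feature": [
            "mean radius", "size_category", "mean texture", "mean texture",
            "mean texture", "size_category", "mean radius",
        ],
        "group": [
            "(16.71, 24.933]", "large", "(25.32, 33.81]", "(16.83, 25.32]",
            "(8.315, 16.83]", "medium", "(8.471, 16.71]",
        ],
        "count": [30, 15, 17, 78, 48, 25, 113],
        "error_count": [3, 1, 1, 4, 2, 1, 4],
    }
)
report["error_rate"] = report["error_count"] / report["count"]
report["accuracy"] = 1.0 - report["error_rate"]

# Ignore small groups, whose error rates are noisy (cutoff is an assumption).
reliable = report[report["count"] >= 20]

# For each feature, keep the single group with the highest error rate.
worst_per_feature = (
    reliable.sort_values("error_rate", ascending=False)
    .groupby("feature", sort=False)
    .head(1)
)
print(worst_per_feature[["feature", "group", "count", "error_rate"]])
```

A similar effect for the size cutoff is available directly through the min_count parameter, which drops small groups before the report is returned.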