******************************
MultiLabelBinarizerTransformer
******************************

.. autoclass:: ds_utils.transformers.multi_label_binarizer.MultiLabelBinarizerTransformer
   :members:

.. highlight:: python

Code Examples
=============

The following examples show the three main ways to use ``MultiLabelBinarizerTransformer``.

Direct usage::

    from ds_utils.transformers.multi_label_binarizer import MultiLabelBinarizerTransformer

    X = [["sci-fi", "action"], ["romance"], ["action", "comedy"]]
    mlb = MultiLabelBinarizerTransformer()
    X_t = mlb.fit_transform(X)
    names = mlb.get_feature_names_out()

``X_t`` is a numpy array of shape ``(n_samples, n_classes)``, dtype ``float64``, with columns
corresponding to ``names`` (e.g. ``['label_action', 'label_comedy', 'label_romance', 'label_sci-fi']``).
The output will be:

+------------+------------+-------------+------------+
|label_action|label_comedy|label_romance|label_sci-fi|
+============+============+=============+============+
|1.0         |0.0         |0.0          |1.0         |
+------------+------------+-------------+------------+
|0.0         |0.0         |1.0          |0.0         |
+------------+------------+-------------+------------+
|1.0         |1.0         |0.0          |0.0         |
+------------+------------+-------------+------------+

Pipeline usage with pandas output::

    from ds_utils.transformers.multi_label_binarizer import MultiLabelBinarizerTransformer
    from sklearn.pipeline import Pipeline

    pipe = Pipeline([("mlb", MultiLabelBinarizerTransformer())])
    pipe.set_output(transform="pandas")
    df = pipe.fit_transform(X)

ColumnTransformer usage::

    from ds_utils.transformers.multi_label_binarizer import MultiLabelBinarizerTransformer
    from sklearn.compose import ColumnTransformer
    import pandas as pd

    df = pd.DataFrame({"tags": [["x", "y"], ["z"]], "num": [1.0, 2.0]})
    pre = ColumnTransformer(
        [("mlb", MultiLabelBinarizerTransformer(), ["tags"])],
        remainder="passthrough",
    )
    X_out = pre.fit_transform(df)