pandera.schemas.DataFrameSchema.set_index

DataFrameSchema.set_index(keys, drop=True, append=False)

Set the Index of a DataFrameSchema from an existing Column or list of columns.

Parameters
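  • keys (List[str]) – names of the existing columns to use as the index

  • drop (bool) – whether to remove the key columns from the schema's columns once they are in the index; default True

  • append (bool) – whether to append the key columns to the existing index instead of replacing it; default False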

Return type

DataFrameSchema

Returns

a new DataFrameSchema with specified column(s) in the index.

Raises

SchemaInitError if any label in keys is not a column in the schema.

Examples

Just as you would set the index in a pandas DataFrame from an existing column, you can set an index within the schema from an existing column in the schema.

>>> import pandera as pa
>>>
>>> example_schema = pa.DataFrameSchema({
...     "category" : pa.Column(pa.String),
...     "probability": pa.Column(pa.Float)})
>>>
>>> print(example_schema.set_index(['category']))
<Schema DataFrameSchema(
    columns={
        'probability': <Schema Column(name=probability, type=float)>
    },
    checks=[],
    coerce=False,
    pandas_dtype=None,
    index=<Schema Index(name=category, type=str)>,
    strict=False
    name=None,
    ordered=False
)>

If your schema already has an index and you would like to append a column to it (yielding a MultiIndex), pass append=True to set_index, as you would in pandas.

>>> example_schema = pa.DataFrameSchema(
...     {
...         "column1": pa.Column(pa.String),
...         "column2": pa.Column(pa.Int)
...     },
...     index=pa.Index(name="column3", pandas_dtype=pa.Int)
... )
>>>
>>> print(example_schema.set_index(["column2"], append=True))
<Schema DataFrameSchema(
    columns={
        'column1': <Schema Column(name=column1, type=str)>
    },
    checks=[],
    coerce=False,
    pandas_dtype=None,
    index=<Schema MultiIndex(
        indexes=[
            <Schema Index(name=column3, type=int)>
            <Schema Index(name=column2, type=int)>
        ]
        coerce=False,
        strict=False,
        name=None,
        ordered=True
    )>,
    strict=False
    name=None,
    ordered=False
)>

See also

reset_index()