pandera.schemas.DataFrameSchema.select_columns

DataFrameSchema.select_columns(columns)

Select a subset of columns in the schema.

New in version 0.4.5

Parameters

columns (List[str]) – list of column names to select.

Return type

DataFrameSchema

Returns

DataFrameSchema (copy of original) with only the selected columns.

Raises

SchemaInitError – if a column is not in the schema.

Example

To subset a schema by column and return a new schema:

>>> import pandera as pa
>>>
>>> example_schema = pa.DataFrameSchema({
...     "category": pa.Column(pa.String),
...     "probability": pa.Column(pa.Float)
... })
>>>
>>> print(example_schema.select_columns(['category']))
DataFrameSchema(
    columns={
        "category": "<Schema Column: 'category' type=str>"
    },
    checks=[],
    index=None,
    coerce=False,
    strict=False
)

Note

If an index is present in the schema, it will also be included in the new schema.