pandera.schemas.DataFrameSchema.select_columns

DataFrameSchema.select_columns(columns)

Select a subset of columns in the schema.

New in version 0.4.5

Parameters

columns (List[str]) – list of column names to select.

Return type

DataFrameSchema

Returns

DataFrameSchema (copy of original) with only the selected columns.

Raises

SchemaInitError if any of the requested columns is not in the schema.

Example

To subset a schema by column and return a new schema:

>>> import pandera as pa
>>>
>>> example_schema = pa.DataFrameSchema({
...     "category": pa.Column(pa.String),
...     "probability": pa.Column(pa.Float)
... })
>>>
>>> print(example_schema.select_columns(['category']))
<Schema DataFrameSchema(
    columns={
        'category': <Schema Column(name=category, type=str)>
    },
    checks=[],
    coerce=False,
    pandas_dtype=None,
    index=None,
    strict=False
    name=None,
    ordered=False
)>

Note

If an index is present in the schema, it will also be included in the new schema.