class pandera.api.pandas.components.Column(dtype=None, checks=None, parsers=None, nullable=False, unique=False, report_duplicates='all', coerce=False, required=True, name=None, regex=False, title=None, description=None, default=None, metadata=None, drop_invalid_rows=False)

Validate types and properties of pandas DataFrame columns.

Create column validator object.

  • dtype (Union[str, type, DataType, Type, ExtensionDtype, dtype]) – datatype of the column, used for type-checking a dataframe. If a string is specified, it is assumed to be one of the valid pandas string values.

  • checks (Union[Check, List[Union[Check, Hypothesis]], None]) – checks to verify validity of the column

  • parsers (Union[Parser, List[Parser], None]) – parsers to preprocess the column before it is validated

  • nullable (bool) – Whether or not column can contain null values.

  • unique (bool) – whether column values should be unique

  • report_duplicates (Union[Literal['exclude_first'], Literal['exclude_last'], Literal['all']]) – how to report unique errors:
      - exclude_first: report all duplicates except the first occurrence
      - exclude_last: report all duplicates except the last occurrence
      - all: (default) report all duplicates

  • coerce (bool) – If True, when schema.validate is called the column will be coerced into the specified dtype. This has no effect on columns where dtype=None.

  • required (bool) – whether the column must be present in the dataframe. If False, the column is allowed to be missing.

  • name (Union[str, Tuple[str, …], None]) – column name in dataframe to validate.

  • regex (bool) – whether the name attribute should be treated as a regex pattern to apply to multiple columns in a dataframe.

  • title (Optional[str, None]) – A human-readable label for the column.

  • description (Optional[str, None]) – An arbitrary textual description of the column.

  • default (Optional[Any, None]) – The default value for missing values in the column.

  • metadata (Optional[dict, None]) – Optional key-value data.

  • drop_invalid_rows (bool) – if True, drop invalid rows on validation.


Raises: SchemaInitError – if it is impossible to build the schema from the parameters


>>> import pandas as pd
>>> import pandera as pa
>>> schema = pa.DataFrameSchema({
...     "column": pa.Column(str)
... })
>>> schema.validate(pd.DataFrame({"column": ["foo", "bar"]}))
  column
0    foo
1    bar

See here for more usage details.




  • dtype – Get the pandas dtype.

  • properties – Get column properties.

  • __init__ – Create column validator object.

  • example – Generate an example of a particular size.

  • get_regex_columns – Get matching column names based on regex column name pattern.

  • set_name – Used to set or modify the name of a column object.

  • strategy – Create a hypothesis strategy for generating a Column.

  • strategy_component – Generate column data object for use by DataFrame strategy.

  • __call__ – Alias for validate method.