pandera.Column

class pandera.Column(pandas_dtype=None, checks=None, nullable=False, allow_duplicates=True, coerce=False, required=True, name=None, regex=False)

Validate types and properties of DataFrame columns.

Create column validator object.

Parameters
  • pandas_dtype (Union[str, type, PandasDtype, ExtensionDtype, None]) – datatype of the column, used to type-check it during validation. If a string is specified, it is assumed to be one of the valid pandas dtype strings: http://pandas.pydata.org/pandas-docs/stable/basics.html#dtypes

  • checks (Union[Check, Hypothesis, List[Union[Check, Hypothesis]], None]) – checks to verify validity of the column

  • nullable (bool) – Whether or not column can contain null values.

  • allow_duplicates (bool) – Whether or not column can contain duplicate values.

  • coerce (bool) – If True, when schema.validate is called the column will be coerced into the specified dtype.

  • required (bool) – Whether or not the column must be present in the dataframe. If False, the column is allowed to be missing.

  • name (Optional[str]) – column name in dataframe to validate.

  • regex (bool) – whether the name attribute should be treated as a regex pattern to apply to multiple columns in a dataframe.

Raises

SchemaInitError – if impossible to build schema from parameters

Example

>>> import pandas as pd
>>> import pandera as pa
>>>
>>>
>>> schema = pa.DataFrameSchema({
...     "column": pa.Column(pa.String)
... })
>>>
>>> schema.validate(pd.DataFrame({"column": ["foo", "bar"]}))
  column
0    foo
1    bar

See here for more usage details.

Attributes

allow_duplicates

Whether to allow duplicate values.

checks

Return list of checks or hypotheses.

coerce

Whether to coerce series to specified type.

dtype

String representation of the dtype.

name

Get SeriesSchema name.

nullable

Whether the series is nullable.

pandas_dtype

Get the pandas dtype.

properties

Get column properties.

regex

True if name attribute should be treated as a regex pattern.

Methods

__init__

Create column validator object.

get_regex_columns

Get matching column names based on regex column name pattern.

set_name

Used to set or modify the name of a column object.

validate

Validate a Column in a DataFrame object.

__call__

Alias for validate method.