pandera.schema_components.Column

class pandera.schema_components.Column(pandas_dtype=None, checks=None, nullable=False, allow_duplicates=True, coerce=False, required=True, name=None, regex=False)

Validate types and properties of DataFrame columns.

Create column validator object.

Parameters
  • pandas_dtype (Union[str, type, PandasDtype, ExtensionDtype, None]) – Datatype of the column, used to type-check it during validation. If a string is given, it must be one of the valid pandas dtype strings: http://pandas.pydata.org/pandas-docs/stable/basics.html#dtypes

  • checks (Union[Check, Hypothesis, List[Union[Check, Hypothesis]], None]) – checks to verify validity of the column

  • nullable (bool) – Whether or not column can contain null values.

  • allow_duplicates (bool) – Whether or not column can contain duplicate values.

  • coerce (bool) – If True, when schema.validate is called the column will be coerced into the specified dtype.

  • required (bool) – Whether the column must be present in the dataframe. If False, the column is allowed to be missing.

  • name (Optional[str]) – column name in dataframe to validate.

  • regex (bool) – whether the name attribute should be treated as a regex pattern to apply to multiple columns in a dataframe.

Raises

SchemaInitError – if it is impossible to build the schema from the given parameters

Example

>>> import pandas as pd
>>> import pandera as pa
>>>
>>>
>>> schema = pa.DataFrameSchema({
...     "column": pa.Column(pa.String)
... })
>>>
>>> schema.validate(pd.DataFrame({"column": ["foo", "bar"]}))
  column
0    foo
1    bar


Attributes

allow_duplicates

Whether to allow duplicate values.

checks

Return list of checks or hypotheses.

coerce

Whether to coerce series to specified type.

dtype

String representation of the dtype.

has_subcomponents

name

Get SeriesSchema name.

nullable

Whether the series is nullable.

pandas_dtype

Get the pandas dtype.

pdtype

PandasDtype of the series.

properties

Get column properties.

regex

True if name attribute should be treated as a regex pattern.

Methods

__init__

Create column validator object.

coerce_dtype

Coerce dtype of a column, handling duplicate column names.

example

Generate an example of a particular size.

get_regex_columns

Get matching column names based on regex column name pattern.

set_name

Used to set or modify the name of a column object.

strategy

Create a hypothesis strategy for generating a Column.

strategy_component

Generate column data object for use by DataFrame strategy.

validate

Validate a Column in a DataFrame object.

__call__

Alias for validate method.