pandera.schema_components.Column
class pandera.schema_components.Column(pandas_dtype=None, checks=None, nullable=False, allow_duplicates=True, coerce=False, required=True, name=None, regex=False)
Validate types and properties of DataFrame columns.
Create column validator object.
- Parameters
pandas_dtype (Union[str, type, PandasDtype, ExtensionDtype, dtype, None]) – datatype of the column. A PandasDtype for type-checking the dataframe. If a string is specified, it is assumed to be one of the valid pandas string values: http://pandas.pydata.org/pandas-docs/stable/basics.html#dtypes
checks (Union[Check, Hypothesis, List[Union[Check, Hypothesis]], None]) – checks to verify validity of the column.
nullable (bool) – whether or not the column can contain null values.
allow_duplicates (bool) – whether or not the column can contain duplicate values.
coerce (bool) – if True, when schema.validate is called the column will be coerced into the specified dtype.
required (bool) – whether the column must be present in the dataframe. If False, the column is allowed to be missing.
name (Optional[str]) – column name in the dataframe to validate.
regex (bool) – whether the name attribute should be treated as a regex pattern to apply to multiple columns in a dataframe.
- Raises
SchemaInitError – if impossible to build schema from parameters
- Example
>>> import pandas as pd
>>> import pandera as pa
>>>
>>> schema = pa.DataFrameSchema({
...     "column": pa.Column(pa.String)
... })
>>>
>>> schema.validate(pd.DataFrame({"column": ["foo", "bar"]}))
  column
0    foo
1    bar
See here for more usage details.
Attributes
allow_duplicates
Whether to allow duplicate values.
checks
Return list of checks or hypotheses.
coerce
Whether to coerce series to specified type.
dtype
String representation of the dtype.
has_subcomponents
name
Get SeriesSchema name.
nullable
Whether the series is nullable.
pandas_dtype
Get the pandas dtype
pdtype
PandasDtype of the series.
properties
Get column properties.
regex
True if name attribute should be treated as a regex pattern.
Methods
__init__
Create column validator object.
coerce_dtype
Coerce dtype of a column, handling duplicate column names.
example
Generate an example of a particular size.
get_regex_columns
Get matching column names based on regex column name pattern.
set_name
Used to set or modify the name of a column object.
strategy
Create a hypothesis strategy for generating a Column.
strategy_component
Generate column data object for use by DataFrame strategy.
validate
Validate a Column in a DataFrame object.
__call__
Alias for validate method.