pandera API extensions

New in version 0.6.0.

This module provides utilities for extending the pandera API.

class pandera.extensions.CheckType(value)

Bases: enum.Enum

Check types for registered check methods.


  • VECTORIZED – check applied to a Series or DataFrame.

  • ELEMENT_WISE – check applied to an element of a Series or DataFrame.

  • GROUPBY – check applied to a dictionary of Series or DataFrames.

pandera.extensions.register_check_method(check_fn=None, *, statistics=None, supported_types=(<class 'pandas.core.frame.DataFrame'>, <class 'pandas.core.series.Series'>), check_type='vectorized', strategy=None)

Registers a function as a Check method.

See the user guide for more details.

Parameters:

  • check_fn – check function to register. The function should take one positional argument for the object to validate and additional keyword-only arguments for the check statistics.

  • statistics (Optional[List[str]]) – names of the keyword-only arguments of check_fn. These serve as the statistics needed to serialize and deserialize the check, and to generate data if a strategy function is provided.

  • supported_types (Union[type, Tuple, List]) – the pandas type(s) supported by the check function. Valid values are pd.DataFrame, pd.Series, or a list/tuple of (pd.DataFrame, pd.Series) if both types are supported.

  • check_type (Union[CheckType, str]) –

    the expected input of the check function. Valid values are CheckType enums or {"vectorized", "element_wise", "groupby"}. The input signature of check_fn is determined by this argument:

    • if vectorized, the first positional argument of check_fn should be one of the supported_types.

    • if element_wise, the first positional argument of check_fn should be a single scalar element in the pandas Series or DataFrame.

    • if groupby, the first positional argument of check_fn should be a dictionary mapping group names to subsets of the Series or DataFrame.

  • strategy – data-generation strategy associated with the check function.


Returns: the register check function wrapper.