pandera.api.dataframe.model.DataFrameModel

class pandera.api.dataframe.model.DataFrameModel(*args, **kwargs)

Base class for the DataFrame model.

See the User Guide for more.

Validate a DataFrame against the schema specification. The parameters below match those of the validate() classmethod.

Parameters:

check_obj (pd.DataFrame) – the dataframe to be validated.
head – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state – random seed for the sample argument.
lazy – if True, evaluates the dataframe against all validation checks and raises a SchemaErrors exception that collects every failure. Otherwise, raises a SchemaError as soon as the first failure occurs.
inplace – if True, applies coercion to the object being validated; otherwise, creates a copy of the data.

Returns:

validated DataFrame

Raises:

SchemaError – when DataFrame violates built-in or custom checks.

Methods

classmethod build_schema_(**kwargs)

classmethod empty(*_args)

Create an empty DataFrame instance.

classmethod example(cls, **kwargs)

Generate an example of this data model specification.

classmethod from_json(source)

Load a schema from JSON.

classmethod from_yaml(yaml_schema)

Load a schema from YAML.

classmethod get_metadata()

Provide metadata at the column level and the schema level.

classmethod pydantic_validate(schema_model)

Verify that the input is a compatible dataframe model.

classmethod strategy(cls, **kwargs)

Create a data synthesis strategy.

classmethod to_json(target=None, **kwargs)

Convert this model’s schema to JSON.

classmethod to_json_schema()

Serialize schema metadata into json-schema format.

classmethod to_schema()

Create DataFrameSchema from the DataFrameModel.

classmethod to_yaml(stream=None)

Convert this model’s schema to YAML.

classmethod validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)

Validate a DataFrame based on the schema specification.