pandera.api.pandas.components.MultiIndex
- class pandera.api.pandas.components.MultiIndex(indexes, coerce=False, strict=False, name=None, ordered=True, unique=None)
Validate types and properties of a pandas DataFrame MultiIndex.
This class inherits from DataFrameSchema to leverage its validation logic.
Create MultiIndex validator.
- Parameters:
indexes (List[Index]): list of Index validators for each level of the MultiIndex index.
coerce (bool): whether or not to coerce the MultiIndex to the specified dtypes before validation.
strict (bool): whether or not to accept columns in the MultiIndex that aren't defined in the indexes argument.
ordered (bool): whether or not to validate the indexes order.
unique (Union[str, List[str], None]): a list of index names that should be jointly unique.
- Example:
>>> import pandas as pd
>>> import pandera as pa
>>>
>>>
>>> schema = pa.DataFrameSchema(
...     columns={"column": pa.Column(int)},
...     index=pa.MultiIndex([
...         pa.Index(str,
...               pa.Check(lambda s: s.isin(["foo", "bar"])),
...               name="index0"),
...         pa.Index(int, name="index1"),
...     ])
... )
>>>
>>> df = pd.DataFrame(
...     data={"column": [1, 2, 3]},
...     index=pd.MultiIndex.from_arrays(
...         [["foo", "bar", "foo"], [0, 1, 2]],
...         names=["index0", "index1"],
...     )
... )
>>>
>>> schema.validate(df)
               column
index0 index1
foo    0            1
bar    1            2
foo    2            3
See here for more usage details.
Attributes
BACKEND_REGISTRY
coerce: Whether or not to coerce data types.
dtype: Get the dtype property.
dtypes: A dict where the keys are column names and values are DataTypes for the column.
names: Get index names in the MultiIndex schema component.
properties: Get the properties of the schema for serialization purposes.
unique: List of columns that should be jointly unique.
Methods
- __init__(indexes, coerce=False, strict=False, name=None, ordered=True, unique=None)
Create MultiIndex validator.
- example(size=None)
Generate an example of a particular size.
- Parameters:
size: number of elements in the generated DataFrame.
- Return type:
- Returns:
pandas DataFrame object.
- strategy(*, size=None)
Create a hypothesis strategy for generating a DataFrame.
- Parameters:
size: number of elements to generate.
n_regex_columns: number of regex columns to generate.
- Returns:
a strategy that generates pandas DataFrame objects.
- __call__(dataframe, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)
Alias for DataFrameSchema.validate() method.
- Parameters:
dataframe (pd.DataFrame): the dataframe to be validated.
head (int): validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail (int): validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample (Optional[int]): validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state (Optional[int]): random seed for the sample argument.
lazy (bool): if True, lazily evaluates dataframe against all validation checks and raises a SchemaErrors. Otherwise, raise SchemaError as soon as one occurs.
inplace (bool): if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Return type:
~TDataObject