pandera.api.pandas.components.MultiIndex
- class pandera.api.pandas.components.MultiIndex(indexes, coerce=False, strict=False, name=None, ordered=True, unique=None)
Validate types and properties of a pandas DataFrame MultiIndex.
This class inherits from DataFrameSchema to leverage its validation logic.
Create MultiIndex validator.
- Parameters:
indexes (list[Index]) – list of Index validators for each level of the MultiIndex index.
coerce (bool) – whether or not to coerce the MultiIndex to the specified dtypes before validation.
strict (bool) – whether or not to accept columns in the MultiIndex that aren't defined in the indexes argument.
ordered (bool) – whether or not to validate the indexes order.
unique (Union[str, list[str], None]) – a list of index names that should be jointly unique.
- Example:
>>> import pandas as pd
>>> import pandera.pandas as pa
>>>
>>> schema = pa.DataFrameSchema(
...     columns={"column": pa.Column(int)},
...     index=pa.MultiIndex([
...         pa.Index(str,
...               pa.Check(lambda s: s.isin(["foo", "bar"])),
...               name="index0"),
...         pa.Index(int, name="index1"),
...     ])
... )
>>>
>>> df = pd.DataFrame(
...     data={"column": [1, 2, 3]},
...     index=pd.MultiIndex.from_arrays(
...         [["foo", "bar", "foo"], [0, 1, 2]],
...         names=["index0", "index1"],
...     )
... )
>>>
>>> schema.validate(df)
               column
index0 index1
foo    0            1
bar    1            2
foo    2            3
See here for more usage details.
Attributes
BACKEND_REGISTRY
coerce – Whether or not to coerce data types.
dtype – Get the dtype property.
dtypes – A dict where the keys are column names and values are DataTypes for the column.
named_indexes – Get named indexes.
names – Get index names in the MultiIndex schema component.
properties – Get the properties of the schema for serialization purposes.
unique – List of columns that should be jointly unique.
Methods
- __init__(indexes, coerce=False, strict=False, name=None, ordered=True, unique=None)
Create MultiIndex validator.
- example(size=None)
Generate an example of a particular size.
- Parameters:
size – number of elements in the generated DataFrame.
- Return type:
MultiIndex
- Returns:
pandas DataFrame object.
- strategy(*, size=None)
Create a hypothesis strategy for generating a DataFrame.
- Parameters:
size – number of elements to generate.
n_regex_columns – number of regex columns to generate.
- Returns:
a strategy that generates pandas DataFrame objects.
- __call__(dataframe, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)
Alias for DataFrameSchema.validate() method.
- Parameters:
dataframe (pd.DataFrame) – the dataframe to be validated.
head (int) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail (int) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample (UnionType[int, None]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state (UnionType[int, None]) – random seed for the sample argument.
lazy (bool) – if True, lazily evaluates dataframe against all validation checks and raises a SchemaErrors. Otherwise, raise SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Return type:
TDataObject