pandera.api.pandas.array.SeriesSchema.validate

SeriesSchema.validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)

Validate a Series object.

Parameters
  • check_obj (Series) – the pandas Series to validate.

  • head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.

  • tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.

  • sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.

  • random_state (Optional[int]) – random seed for the sample argument.

  • lazy (bool) – if True, evaluates the Series against all validation checks and raises a single SchemaErrors that collects every failure. Otherwise, raises SchemaError as soon as the first failure occurs.

  • inplace (bool) – if True, applies coercion to the object of validation, otherwise creates a copy of the data.

Return type

Series

Returns

validated Series.

Raises

SchemaError – when the Series violates built-in or custom checks. With lazy=True, a SchemaErrors is raised instead.

Example

>>> import pandas as pd
>>> import pandera as pa
>>>
>>> series_schema = pa.SeriesSchema(
...     float, [
...         pa.Check(lambda s: s > 0),
...         pa.Check(lambda s: s < 1000),
...         pa.Check(lambda s: s.mean() > 300),
...     ])
>>> series = pd.Series([1, 100, 800, 900, 999], dtype=float)
>>> print(series_schema.validate(series))
0      1.0
1    100.0
2    800.0
3    900.0
4    999.0
dtype: float64