pandera.strategies.pandas_strategies
Generate synthetic data from a schema definition.
New in 0.6.0.
This module generates data based on the type and check constraints specified in a pandera schema. It's built on top of the hypothesis package, which it uses to compose strategies from the multiple checks specified in a schema.
See the user guide for more details.
- pandera.strategies.pandas_strategies.column_strategy(pandera_dtype, strategy=None, *, checks=None, unique=False, name=None)
Create a data object describing a column in a DataFrame.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
checks (UnionType[Sequence,None]) – sequence of Checks to constrain the values of the data in the column/index.
unique (bool) – whether or not generated Series contains unique values.
- Returns:
a column object.
- pandera.strategies.pandas_strategies.convert_dtype(array, col_dtype)
Convert datatypes of an array (series or index).
- pandera.strategies.pandas_strategies.convert_dtypes(df, col_dtypes)
Convert datatypes of a dataframe.
- pandera.strategies.pandas_strategies.dataframe_strategy(pandera_dtype=None, strategy=None, *, columns=None, checks=None, unique=None, index=None, size=None, n_regex_columns=1)
Strategy to generate a pandas DataFrame.
- Parameters:
pandera_dtype (UnionType[DataType,None]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – if specified, this will raise a BaseStrategyOnlyError, since it cannot be chained to a prior strategy.
columns (UnionType[dict,None]) – a dictionary where keys are column names and values are Column objects.
checks (UnionType[Sequence,None]) – sequence of Checks to constrain the values of the data at the dataframe level.
unique (UnionType[list[str],None]) – a list of column names that should be jointly unique.
index (UnionType[Any,None]) – Index or MultiIndex schema component.
size (UnionType[int,None]) – number of rows in the DataFrame.
n_regex_columns (int) – number of regex columns to generate.
- Returns:
hypothesis strategy.
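To make "jointly unique" concrete, here is a small pure-Python sketch of the property. The helper jointly_unique and the sample rows are hypothetical and not part of pandera; they only state the condition that the listed columns are expected to satisfy together.

```python
def jointly_unique(rows, columns):
    """Return True if no two rows share the same combination of values in columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},  # repeats a=1, but the pair (1, "y") is new
    {"a": 2, "b": "x"},  # repeats b="x", but the pair (2, "x") is new
]

print(jointly_unique(rows, ["a", "b"]))  # True
print(jointly_unique(rows, ["a"]))       # False
```

Individual columns may therefore repeat values as long as each combination across the listed columns appears only once.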
- pandera.strategies.pandas_strategies.eq_strategy(pandera_dtype, strategy=None, *, value)
Strategy to generate a single value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
value (Any) – value to generate.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.field_element_strategy(pandera_dtype, strategy=None, *, checks=None)
Strategy to generate elements of a column or index.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
checks (UnionType[Sequence,None]) – sequence of Checks to constrain the values of the data in the column/index.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.ge_strategy(pandera_dtype, strategy=None, *, min_value)
Strategy to generate values greater than or equal to a minimum value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
min_value (Union[int,float]) – generate values greater than or equal to this.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.gt_strategy(pandera_dtype, strategy=None, *, min_value)
Strategy to generate values greater than a minimum value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
min_value (Union[int,float]) – generate values larger than this.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.in_range_strategy(pandera_dtype, strategy=None, *, min_value, max_value, include_min=True, include_max=True)
Strategy to generate values within a particular range.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
min_value (Union[int,float]) – lower bound of the generated values; the bound itself is included when include_min is True.
max_value (Union[int,float]) – upper bound of the generated values; the bound itself is included when include_max is True.
include_min (bool) – include min_value in generated data.
include_max (bool) – include max_value in generated data.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
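The interaction between the bounds and the include_min / include_max flags can be written out as a plain predicate. This is a sketch of the condition generated values are expected to meet, using a hypothetical helper named in_range; it is not how pandera builds the strategy.

```python
def in_range(value, min_value, max_value, include_min=True, include_max=True):
    """Hypothetical helper: the condition an in-range value is expected to meet."""
    above = value >= min_value if include_min else value > min_value
    below = value <= max_value if include_max else value < max_value
    return above and below


print(in_range(0, 0, 10))                      # True: lower bound included by default
print(in_range(0, 0, 10, include_min=False))   # False: lower bound excluded
print(in_range(10, 0, 10, include_max=False))  # False: upper bound excluded
```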
- pandera.strategies.pandas_strategies.index_strategy(pandera_dtype, strategy=None, *, checks=None, nullable=False, unique=False, name=None, size=None)
Strategy to generate a pandas Index.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
checks (UnionType[Sequence,None]) – sequence of Checks to constrain the values of the data in the column/index.
nullable (bool) – whether or not the generated Index contains null values.
unique (bool) – whether or not the generated Index contains unique values.
size (UnionType[int,None]) – number of elements in the Index.
- Returns:
hypothesis strategy.
- pandera.strategies.pandas_strategies.isin_strategy(pandera_dtype, strategy=None, *, allowed_values)
Strategy to generate values within a finite set.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.le_strategy(pandera_dtype, strategy=None, *, max_value)
Strategy to generate values less than or equal to a maximum value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
max_value (Union[int,float]) – generate values less than or equal to this.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.lt_strategy(pandera_dtype, strategy=None, *, max_value)
Strategy to generate values less than a maximum value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
max_value (Union[int,float]) – generate values less than this.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.multiindex_strategy(pandera_dtype=None, strategy=None, *, indexes=None, size=None)
Strategy to generate a pandas MultiIndex object.
- Parameters:
pandera_dtype (UnionType[DataType,None]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
indexes (UnionType[list,None]) – a list of Index objects.
size (UnionType[int,None]) – number of elements in the MultiIndex.
- Returns:
hypothesis strategy.
- pandera.strategies.pandas_strategies.ne_strategy(pandera_dtype, strategy=None, *, value)
Strategy to generate anything except for a particular value.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
value (Any) – value to avoid.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.notin_strategy(pandera_dtype, strategy=None, *, forbidden_values)
Strategy to generate values excluding a set of forbidden values.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
forbidden_values (Sequence[Any]) – set of forbidden values.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.numpy_complex_dtypes(dtype, min_value=0j, max_value=None, allow_infinity=None, allow_nan=None)
Create numpy strategy for complex numbers.
- pandera.strategies.pandas_strategies.numpy_time_dtypes(dtype, min_value=None, max_value=None)
Create numpy strategy for datetime and timedelta data types.
- Parameters:
dtype (Union[dtype,DatetimeTZDtype]) – numpy datetime or timedelta datatype
min_value – minimum value of the datatype to create
max_value – maximum value of the datatype to create
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.pandas_dtype_strategy(pandera_dtype, strategy=None, **kwargs)
Strategy to generate data from a pandera.dtypes.DataType.
- Parameters:
pandera_dtype (DataType) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
- Kwargs:
keyword arguments passed into hypothesis.extra.numpy.from_dtype. For datetime, timedelta, and complex number datatypes, these arguments are passed into numpy_time_dtypes() and numpy_complex_dtypes().
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.register_check_strategy(strategy_fn)
Decorate a Check method with a strategy.
This should be applied to a built-in Check method.
- Parameters:
strategy_fn (Callable[...,SearchStrategy]) – add strategy to a check, using check statistics to generate a hypothesis strategy.
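The general shape of "decorate a check factory so that the resulting check carries a strategy bound to its statistics" can be modelled with the standard library alone. Everything below is a toy: ToyCheck, register_toy_strategy, and toy_ge_strategy are hypothetical names, the "strategy" returns a string rather than a hypothesis object, and none of it should be read as pandera's actual implementation.

```python
from functools import partial, wraps


class ToyCheck:
    """Hypothetical stand-in for a check; not pandera's Check class."""

    def __init__(self, statistics):
        self.statistics = statistics  # e.g. {"min_value": 3}
        self.strategy = None


def register_toy_strategy(strategy_fn):
    """Attach strategy_fn to the check a factory returns, bound to its statistics."""

    def decorator(factory):
        @wraps(factory)
        def wrapper(*args, **kwargs):
            check = factory(*args, **kwargs)
            # Pre-fill the strategy function with the check's own statistics.
            check.strategy = partial(strategy_fn, **check.statistics)
            return check

        return wrapper

    return decorator


def toy_ge_strategy(*, min_value):
    # Stand-in "strategy": returns a description instead of a hypothesis object.
    return f"values >= {min_value}"


@register_toy_strategy(toy_ge_strategy)
def greater_than_or_equal_to(min_value):
    return ToyCheck({"min_value": min_value})


check = greater_than_or_equal_to(3)
described = check.strategy()
print(described)  # values >= 3
```

The point of the pattern is that the statistics recorded when the check is created (here min_value) are the same values later used to generate data that satisfies it.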
- pandera.strategies.pandas_strategies.series_strategy(pandera_dtype, strategy=None, *, checks=None, nullable=False, unique=False, name=None, size=None)
Strategy to generate a pandas Series.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
checks (UnionType[Sequence,None]) – sequence of Checks to constrain the values of the data in the column/index.
nullable (bool) – whether or not generated Series contains null values.
unique (bool) – whether or not generated Series contains unique values.
size (UnionType[int,None]) – number of elements in the Series.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy.
- pandera.strategies.pandas_strategies.str_contains_strategy(pandera_dtype, strategy=None, *, pattern)
Strategy to generate strings that contain a particular pattern.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
pattern (str) – regex pattern.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.str_endswith_strategy(pandera_dtype, strategy=None, *, string)
Strategy to generate strings that end with a specific string pattern.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
string (str) – string pattern.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.str_length_strategy(pandera_dtype, strategy=None, *, min_value, max_value)
Strategy to generate strings of a particular length.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
min_value (int) – minimum string length.
max_value (int) – maximum string length.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
- pandera.strategies.pandas_strategies.str_matches_strategy(pandera_dtype, strategy=None, *, pattern)
Strategy to generate strings that match a regex pattern.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
pattern (str) – regex pattern.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy
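The difference between "matches" here and "contains" in str_contains_strategy can be illustrated with the standard re module. This sketch assumes the two follow the usual regex distinction, with matching anchored at the start of the string and containment allowing the pattern anywhere; the pattern and sample strings are made up, and the page does not state how strictly the generated strings are anchored.

```python
import re

pattern = r"\d{3}"

# "Matches" in the sense of re.match: the pattern must match from the first character.
match_at_start = bool(re.match(pattern, "123abc"))
match_in_middle = bool(re.match(pattern, "abc123"))

# "Contains" in the sense of re.search: the pattern may occur anywhere.
contains_in_middle = bool(re.search(pattern, "abc123"))

print(match_at_start)      # True
print(match_in_middle)     # False
print(contains_in_middle)  # True
```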
- pandera.strategies.pandas_strategies.str_startswith_strategy(pandera_dtype, strategy=None, *, string)
Strategy to generate strings that start with a specific string pattern.
- Parameters:
pandera_dtype (Union[DataType,DataType]) – pandera.dtypes.DataType instance.
strategy (UnionType[SearchStrategy,None]) – an optional hypothesis strategy. If specified, the pandas dtype strategy will be chained onto this strategy.
string (str) – string pattern.
- Return type:
SearchStrategy
- Returns:
hypothesis strategy