pandera.dtypes.PandasDtype
class pandera.dtypes.PandasDtype(value)
Bases: enum.Enum
Enumerate all valid pandas data types.
pandera follows the numpy data types used by pandas and by default supports using the numpy data type string aliases to validate DataFrame or Series dtypes.
This class simply enumerates the valid numpy dtypes for pandas arrays. For convenience, PandasDtype enums can all be accessed in the top-level pandera namespace via the same enum name.
Examples
>>> import pandas as pd
>>> import pandera as pa
>>>
>>> pa.SeriesSchema(pa.Int).validate(pd.Series([1, 2, 3]))
0    1
1    2
2    3
dtype: int64
>>> pa.SeriesSchema(pa.Float).validate(pd.Series([1.1, 2.3, 3.4]))
0    1.1
1    2.3
2    3.4
dtype: float64
>>> pa.SeriesSchema(pa.String).validate(pd.Series(["a", "b", "c"]))
0    a
1    b
2    c
dtype: object
Alternatively, you can use built-in Python scalar types for integers, floats, booleans, and strings:
>>> pa.SeriesSchema(int).validate(pd.Series([1, 2, 3]))
0    1
1    2
2    3
dtype: int64
You can also use the pandas string aliases in the schema definition:
>>> pa.SeriesSchema("int").validate(pd.Series([1, 2, 3]))
0    1
1    2
2    3
dtype: int64
Note
pandera also offers limited support for pandas extension types. However, since the release of pandas 1.0.0 there are backwards-incompatible extension types such as the Integer array. The extension types, e.g. pd.Int64Dtype(), and their string aliases should work when supplied to the pandas_dtype argument, unless otherwise specified below, but this functionality is only tested for pandas >= 1.0.0. Extension types in earlier versions are not guaranteed to work as the pandas_dtype argument in schemas or schema components.
Attributes
Bool          "bool" numpy dtype
Category      pandas "categorical" datatype
Complex       "complex" numpy dtype
Complex128    "complex" numpy dtype
Complex256    "complex" numpy dtype
Complex64     "complex" numpy dtype
DateTime      "datetime64[ns]" numpy dtype
Float         "float" numpy dtype
Float16       "float16" numpy dtype
Float32       "float32" numpy dtype
Float64       "float64" numpy dtype
INT16         "Int16" pandas dtype: pandas 0.24.0+
INT32         "Int32" pandas dtype: pandas 0.24.0+
INT64         "Int64" pandas dtype: pandas 0.24.0+
INT8          "Int8" pandas dtype: pandas 0.24.0+
Int           "int" numpy dtype
Int16         "int16" numpy dtype
Int32         "int32" numpy dtype
Int64         "int64" numpy dtype
Int8          "int8" numpy dtype
Object        "object" numpy dtype
STRING        "string" pandas dtype: pandas 1.0.0+
String        "str" numpy dtype
Timedelta     "timedelta64[ns]" numpy dtype
UINT16        "UInt16" pandas dtype: pandas 0.24.0+
UINT32        "UInt32" pandas dtype: pandas 0.24.0+
UINT64        "UInt64" pandas dtype: pandas 0.24.0+
UINT8         "UInt8" pandas dtype: pandas 0.24.0+
UInt16        "uint16" numpy dtype
UInt32        "uint32" numpy dtype
UInt64        "uint64" numpy dtype
UInt8         "uint8" numpy dtype
str_alias
Get datatype string alias.
classmethod from_str_alias(str_alias)
Get PandasDtype from string alias.
- Param
pandas dtype string alias from https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html#basics-dtypes
- Return type
- Returns
pandas dtype
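The relationship between str_alias and from_str_alias can be illustrated with a small stand-alone enum. This is only a sketch of the general pattern, not pandera's implementation: DtypeSketch is a hypothetical class with a handful of members copied from the attribute list above, and raising TypeError for an unknown alias is an assumption made for the example.

```python
from enum import Enum


class DtypeSketch(Enum):
    """Hypothetical stand-in for PandasDtype; only a few members are shown."""

    Bool = "bool"
    Int64 = "int64"    # numpy dtype alias
    INT64 = "Int64"    # pandas extension dtype alias (capitalization matters)
    String = "str"     # numpy dtype alias
    STRING = "string"  # pandas extension dtype alias

    @property
    def str_alias(self) -> str:
        # In this sketch the enum value doubles as the dtype string alias.
        return self.value

    @classmethod
    def from_str_alias(cls, str_alias: str) -> "DtypeSketch":
        # Enum lookup by value; the error type is an assumption of this sketch.
        try:
            return cls(str_alias)
        except ValueError:
            raise TypeError(
                f"unrecognized dtype string alias: {str_alias!r}"
            ) from None


# Round trip: alias -> member -> alias.
member = DtypeSketch.from_str_alias("Int64")
print(member.name, member.str_alias)
```

Because the lookup is case-sensitive, "Int64" and "int64" resolve to different members, which mirrors the distinction between the upper-case pandas extension entries and the numpy entries in the attribute list.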
classmethod from_pandas_api_type(pandas_api_type)
Get PandasDtype enum from pandas api type.
- Parameters
pandas_api_type (str) – string output from https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.api.types.infer_dtype.html
- Return type
- Returns
pandas dtype
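As a rough illustration of what a lookup from infer_dtype output to an enum member might look like, the sketch below maps a few of the strings that pandas.api.types.infer_dtype is documented to return (such as "integer", "floating", and "mixed-integer") onto a hypothetical DtypeSketch enum. The mapping table, the handling of "mixed" results, and the TypeError for unmapped strings are all assumptions of this example; pandera's actual mapping may differ.

```python
from enum import Enum


class DtypeSketch(Enum):
    """Hypothetical stand-in for PandasDtype; only a few members are shown."""

    Bool = "bool"
    Category = "category"
    DateTime = "datetime64[ns]"
    Float = "float"
    Int = "int"
    Object = "object"
    String = "str"
    Timedelta = "timedelta64[ns]"


# Assumed mapping from infer_dtype result strings to enum members.
_API_TYPE_TO_DTYPE = {
    "string": DtypeSketch.String,
    "floating": DtypeSketch.Float,
    "integer": DtypeSketch.Int,
    "categorical": DtypeSketch.Category,
    "boolean": DtypeSketch.Bool,
    "datetime64": DtypeSketch.DateTime,
    "timedelta64": DtypeSketch.Timedelta,
}


def from_pandas_api_type(pandas_api_type: str) -> DtypeSketch:
    """Translate an infer_dtype result string into a DtypeSketch member."""
    # Treat every "mixed*" result as a generic object column (assumption).
    if pandas_api_type.startswith("mixed"):
        return DtypeSketch.Object
    try:
        return _API_TYPE_TO_DTYPE[pandas_api_type]
    except KeyError:
        raise TypeError(
            f"no dtype mapped for pandas api type: {pandas_api_type!r}"
        ) from None


print(from_pandas_api_type("integer").value)
print(from_pandas_api_type("mixed-integer").value)
```

In practice the input string would come from calling pandas.api.types.infer_dtype on a Series; the sketch takes the string directly so that it runs without pandas installed.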