pandera.api.xarray.container.DataArraySchema
- class pandera.api.xarray.container.DataArraySchema(dtype=None, dims=None, ordered_dims=True, sizes=None, shape=None, coords=None, attrs=None, name=None, checks=None, parsers=None, coerce=False, nullable=False, chunked=None, array_type=None, strict_coords=False, strict_attrs=False, encoding=None, title=None, description=None, metadata=None)
A lightweight xarray DataArray validator.
Initialize a DataArraySchema.
- Parameters:
dims (Union[tuple[UnionType[str, None], …], list[UnionType[str, None]], dict[str, str], None]) – dimension names. Can be a list of dimension names or a dict mapping dimension names to dimension types.
ordered_dims (bool) – if True (default), dims validation is positional: the order must match. If False, only the set of dim names is checked.
sizes (UnionType[dict[str, UnionType[int, None]], None]) – size requirements for dimensions.
shape (UnionType[tuple[UnionType[int, None], …], None]) – shape requirements for the DataArray.
coords (UnionType[dict[str, Any], list[str], None]) – coordinate specifications.
attrs (UnionType[dict[str, Any], type[BaseModel], None]) – attribute specifications. Can be a dict[str, Any] where values are literals (equality), regex strings starting with ^ (pattern match), or callables (value) -> bool. Alternatively, pass a pydantic.BaseModel class to validate the full attrs dict against the model's schema.
checks (Union[Check, list[Union[Check, Hypothesis]], None]) – checks applied to the whole DataArray (after structural validation).
parsers (Union[Parser, list[Parser], None]) – parsers applied to the whole DataArray before checks.
coerce (bool) – whether or not to coerce data.
nullable (bool) – whether the DataArray can contain null values.
chunked (UnionType[bool, None]) – if True, require a Dask-backed array; if False, require eager data; if None, do not check.
array_type (UnionType[Any, None]) – expected type of the underlying array (e.g. numpy.ndarray).
strict_coords (Union[bool, Literal['filter']]) – whether to enforce strict coordinate validation.
strict_attrs (Union[bool, Literal['filter']]) – whether to enforce strict attribute validation.
encoding (UnionType[dict[str, Any], type[BaseModel], None]) – expected per-variable encoding key-value pairs. Validated against da.encoding, which is populated when reading from netCDF/Zarr (common keys: _FillValue, dtype, scale_factor, add_offset, zlib, complevel, units, calendar). Can be a dict[str, Any] where values are literals (equality), regex strings starting with ^, or callables (value) -> bool. Alternatively, pass a pydantic.BaseModel class to validate the full encoding dict against the model's schema.
title (UnionType[str, None]) – a human-readable label for the schema.
description (UnionType[str, None]) – an arbitrary textual description of the schema.
metadata (UnionType[dict, None]) – optional key-value data.
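The three value-matching rules described for the dict form of attrs and encoding (literal equality, regex strings starting with ^, and callables) can be illustrated with a small standalone sketch. This is not pandera's implementation: the helpers `value_matches` and `failing_keys` are hypothetical names, the use of `re.search` for the pattern case is an assumption, and a missing key is treated here as a failure.

```python
import re
from typing import Any, Mapping


def value_matches(expected: Any, actual: Any) -> bool:
    """Hypothetical helper: apply one of the three matching rules."""
    if callable(expected):
        # Callable rule: the spec value is a predicate (value) -> bool.
        return bool(expected(actual))
    if isinstance(expected, str) and expected.startswith("^"):
        # Regex rule: strings starting with "^" are treated as patterns.
        # Using re.search is an assumption of this sketch.
        return isinstance(actual, str) and re.search(expected, actual) is not None
    # Literal rule: plain equality.
    return expected == actual


def failing_keys(spec: Mapping[str, Any], attrs: Mapping[str, Any]) -> list:
    """Return the spec keys that are missing from attrs or do not match."""
    return [
        key
        for key, expected in spec.items()
        if key not in attrs or not value_matches(expected, attrs[key])
    ]


spec = {
    "units": "K",                                       # literal
    "institution": r"^NOAA",                            # regex
    "version": lambda v: isinstance(v, int) and v >= 2,  # callable
}

good = {"units": "K", "institution": "NOAA/GFDL", "version": 3}
bad = {"units": "degC", "institution": "ECMWF"}

print(failing_keys(spec, good))  # []
print(failing_keys(spec, bad))   # ['units', 'institution', 'version']
```

The same kind of dict would be passed as `attrs=` or `encoding=` when constructing the schema; how pandera reports a mismatch is not shown here.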
Attributes
BACKEND_REGISTRY
properties – Get the properties of the schema for serialization purposes.
Methods
- __init__(dtype=None, dims=None, ordered_dims=True, sizes=None, shape=None, coords=None, attrs=None, name=None, checks=None, parsers=None, coerce=False, nullable=False, chunked=None, array_type=None, strict_coords=False, strict_attrs=False, encoding=None, title=None, description=None, metadata=None)
Initialize a DataArraySchema.
- classmethod from_json(source)
Load schema from JSON (see pandera.io.xarray_io).
- Return type:
- classmethod from_yaml(yaml_schema)
Load schema from YAML (see pandera.io.xarray_io).
- Return type:
- static register_default_backends(check_obj_cls)
Register default backends.
This method is invoked in the get_backend method so that the appropriate validation backend is loaded at validation time instead of schema-definition time.
This method needs to be implemented by the schema subclass.
- to_json(target=None, *, minimal=True, **kwargs)
Write schema to JSON (see pandera.io.xarray_io).
- validate(check_obj, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)
Validate a DataArray based on the schema specification.
- Parameters:
check_obj (DataArray) – the DataArray to be validated.
head (UnionType[int, None]) – validate only the first n positions along the first dimension (see backend subsampling).
tail (UnionType[int, None]) – validate only the last n positions along the first dimension.
sample (UnionType[int, None]) – validate a random subset of size n along the first dimension.
random_state (UnionType[int, None]) – random seed for the sample argument.
lazy (bool) – if True, evaluates the DataArray against all validation checks and raises a SchemaErrors exception. Otherwise, raises SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object being validated; otherwise creates a copy of the data.
- Return type:
- Returns:
validated DataArray
- Raises:
SchemaError – when the DataArray violates built-in or custom checks.
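The difference between the eager and lazy modes of the lazy parameter can be sketched in plain Python. This is an illustration of the control flow only, not pandera's code: `SketchSchemaError`, `SketchSchemaErrors`, and `run_checks` are hypothetical stand-ins, and the data is an ordinary list rather than a DataArray.

```python
class SketchSchemaError(Exception):
    """Stand-in for a single failed check (not pandera's SchemaError)."""


class SketchSchemaErrors(Exception):
    """Stand-in for the aggregate error raised in lazy mode."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} check(s) failed")
        self.errors = list(errors)


def run_checks(data, checks, lazy=False):
    """Run (name, predicate) pairs against data.

    Eager mode raises on the first failure; lazy mode runs every check
    and raises one aggregate error that carries all failures.
    """
    failures = []
    for name, predicate in checks:
        if predicate(data):
            continue
        error = SketchSchemaError(f"check {name!r} failed")
        if not lazy:
            raise error
        failures.append(error)
    if failures:
        raise SketchSchemaErrors(failures)
    return data


checks = [
    ("non_negative", lambda xs: all(x >= 0 for x in xs)),
    ("below_100", lambda xs: all(x < 100 for x in xs)),
]
data = [-1, 5, 250]  # violates both checks

try:
    run_checks(data, checks, lazy=False)
except SketchSchemaError as exc:
    print("eager:", exc)  # only the first failure is reported

try:
    run_checks(data, checks, lazy=True)
except SketchSchemaErrors as exc:
    print("lazy:", [str(e) for e in exc.errors])  # both failures
```

Lazy mode is the more useful choice when a full report of problems is wanted in one pass; eager mode stops at the first problem.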
Chunked (Dask-backed) arrays default to SCHEMA_ONLY for data-level checks unless validation_depth is set (see pandera.api.xarray.utils.get_validation_depth()).
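The head, tail, sample, and random_state arguments of validate select positions along the first dimension. A minimal sketch of such a selection over integer positions follows. The function `subsample_indices` is hypothetical, and combining the three selections as a sorted union is an assumption of this sketch; the backend's actual subsampling may differ.

```python
import random


def subsample_indices(length, head=None, tail=None, sample=None, random_state=None):
    """Hypothetical helper: pick positions along an axis of the given length.

    With no arguments, every position is kept. Otherwise the head, tail,
    and sample selections are merged into one sorted list (an assumption).
    """
    if head is None and tail is None and sample is None:
        return list(range(length))
    chosen = set()
    if head is not None:
        chosen.update(range(min(head, length)))
    if tail is not None:
        chosen.update(range(max(length - tail, 0), length))
    if sample is not None:
        # A seeded generator makes the random subset reproducible.
        rng = random.Random(random_state)
        chosen.update(rng.sample(range(length), min(sample, length)))
    return sorted(chosen)


print(subsample_indices(10, head=2, tail=3))  # [0, 1, 7, 8, 9]

# The same seed yields the same random subset.
first = subsample_indices(10, sample=4, random_state=0)
second = subsample_indices(10, sample=4, random_state=0)
print(first == second)  # True
```

With an xarray object, a list of positions like this could be applied along the first dimension with `isel`.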