Configuration¶
Validation depth and Dask / chunked data¶
Pandera uses ValidationDepth for xarray the same way
it does for Polars lazy frames:
- SCHEMA_ONLY: only structural validation (dims, dtype, coords, attrs, name, shape). Data-level Check objects are skipped.
- DATA_ONLY: only data-level checks.
- SCHEMA_AND_DATA: full validation (default for eager arrays).
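The three depths can be read as a gate over two kinds of checks. The sketch below is illustrative only: `Depth` and `runs` are hypothetical stand-ins written for this example, not pandera's own classes or functions.

```python
from enum import Enum


class Depth(Enum):
    """Hypothetical stand-in mirroring the three depth names above."""

    SCHEMA_ONLY = "SCHEMA_ONLY"
    DATA_ONLY = "DATA_ONLY"
    SCHEMA_AND_DATA = "SCHEMA_AND_DATA"


def runs(depth: Depth, kind: str) -> bool:
    """Return True if checks of `kind` ('schema' or 'data') run at `depth`."""
    if kind == "schema":
        return depth in (Depth.SCHEMA_ONLY, Depth.SCHEMA_AND_DATA)
    if kind == "data":
        return depth in (Depth.DATA_ONLY, Depth.SCHEMA_AND_DATA)
    raise ValueError(f"unknown check kind: {kind!r}")


# Show which kinds of checks each depth lets through.
for depth in Depth:
    print(depth.name, "schema:", runs(depth, "schema"), "data:", runs(depth, "data"))
```

Structural checks only read metadata, so they are cheap on any array; data checks are the ones that need the actual values.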
Chunked (Dask-backed) arrays¶
When an array is backed by Dask (i.e. da.chunks is not None), data-level
checks would trigger .compute(), which may be expensive. To avoid
surprises, chunked arrays default to SCHEMA_ONLY when no explicit depth
is set. Eager (NumPy-backed) arrays default to SCHEMA_AND_DATA.
Opting in to data checks on Dask arrays¶
Set the validation depth explicitly:
import numpy as np
import xarray as xr
import pandera.xarray as pa
from pandera.config import ValidationDepth, config_context

schema = pa.DataArraySchema(
    dtype=np.float64,
    dims=("x",),
    checks=pa.Check(lambda da: float(da.min()) >= 0),
)

da = xr.DataArray(np.ones(5), dims="x")

with config_context(validation_depth=ValidationDepth.SCHEMA_AND_DATA):
    schema.validate(da)
Or set the environment variable before running your program:
export PANDERA_VALIDATION_DEPTH=SCHEMA_AND_DATA
Resolution order¶
get_validation_depth() resolves the depth in this order:
1. Active config_context(validation_depth=...): highest priority.
2. Global config (PANDERA_VALIDATION_DEPTH env var or PanderaConfig.validation_depth).
3. Per-object default: SCHEMA_ONLY for chunked data, SCHEMA_AND_DATA for eager data.
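The resolution order above can be sketched as a plain fallback chain. This is a minimal illustration, not pandera's implementation: `Depth` and `resolve_depth` are hypothetical names, and passing `None` for "not set" is an assumption made for the example.

```python
from enum import Enum
from typing import Optional


class Depth(Enum):
    """Hypothetical stand-in for the depth values named in this section."""

    SCHEMA_ONLY = "SCHEMA_ONLY"
    DATA_ONLY = "DATA_ONLY"
    SCHEMA_AND_DATA = "SCHEMA_AND_DATA"


def resolve_depth(
    context_depth: Optional[Depth],
    global_depth: Optional[Depth],
    is_chunked: bool,
) -> Depth:
    """Pick a depth using the three-step priority described above."""
    # 1. An active context override wins.
    if context_depth is not None:
        return context_depth
    # 2. Otherwise fall back to the global setting (env var or config object).
    if global_depth is not None:
        return global_depth
    # 3. Otherwise use the per-object default.
    return Depth.SCHEMA_ONLY if is_chunked else Depth.SCHEMA_AND_DATA


# A chunked array with nothing configured gets the cheap default.
print(resolve_depth(None, None, is_chunked=True).name)
# A context override beats both the global setting and the default.
print(resolve_depth(Depth.SCHEMA_AND_DATA, Depth.DATA_ONLY, is_chunked=True).name)
```

One consequence of this ordering is that a global setting applies to chunked and eager arrays alike, since the per-object default is only consulted when nothing else is set.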
Disabling validation¶
Set PANDERA_VALIDATION_ENABLED=false (env var) or use
config_context(validation_enabled=False) to make validate() a no-op that
returns the input unchanged:
with config_context(validation_enabled=False):
    bad_da = xr.DataArray([-999], dims="z", name="wrong")
    result = schema.validate(bad_da)
    print(f"Validation skipped, returned: {result.values}")
Validation skipped, returned: [-999]
See also¶
- Dask and Duck Arrays: Dask integration, chunked, array_type, and lazy validation
- Checks and Parsers: checks, parsers, and lazy validation
- Decorators: check_input, check_output, check_io, and check_types
- DataArray Schemas: DataArraySchema details
- Dataset Schemas: DatasetSchema details
- Data Models: class-based DataArrayModel / DatasetModel
- Xarray: full API reference for all xarray classes
- Configuration: global ValidationDepth, ValidationScope, env vars