pandera.api.pyspark.container.DataFrameSchema.__call__
DataFrameSchema.__call__(dataframe, head=None, tail=None, sample=None, random_state=None, lazy=True, inplace=False)
Alias for the DataFrameSchema.validate() method.

Parameters:
dataframe (DataFrame) – the DataFrame object to be validated.
head (int) – not used, since Spark has no concept of head or tail.
tail (int) – not used, since Spark has no concept of head or tail.
sample (Optional[int]) – validate a random sample of the rows. The value is a fraction between 0 and 1; for example, 0.1 samples 10% of the rows. See the PySpark documentation: https://spark.apache.org/docs/3.1.2/api/python/reference/api/pyspark.sql.DataFrame.sample.html
lazy (bool) – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors. Otherwise, raises SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object of validation; otherwise creates a copy of the data.
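
The alias relationship and the lazy parameter described above can be illustrated with a minimal pure-Python sketch. It does not use pandera or Spark: SketchSchema, SketchSchemaError and SketchSchemaErrors are hypothetical stand-ins invented for this example, and plain lists stand in for dataframes. It only shows the general pattern of __call__ forwarding to validate(), and of a lazy run collecting every failure before raising, under those assumptions.

```python
# Hypothetical stand-ins for illustration only; these are NOT pandera classes.
class SketchSchemaError(Exception):
    """Raised for a single failed check (eager mode)."""


class SketchSchemaErrors(Exception):
    """Aggregates every failed check found during a lazy run."""

    def __init__(self, failures):
        super().__init__(f"{len(failures)} check(s) failed")
        self.failures = failures


class SketchSchema:
    def __init__(self, checks):
        # checks: mapping of check name -> predicate over the rows
        self.checks = checks

    def validate(self, rows, lazy=True):
        failures = []
        for name, predicate in self.checks.items():
            if predicate(rows):
                continue
            if not lazy:
                # Eager mode: stop at the first failing check.
                raise SketchSchemaError(name)
            failures.append(name)
        if failures:
            # Lazy mode: report all failing checks together.
            raise SketchSchemaErrors(failures)
        return rows

    def __call__(self, rows, lazy=True):
        # Calling the schema object forwards to validate() unchanged.
        return self.validate(rows, lazy=lazy)


schema = SketchSchema({
    "has_two_rows": lambda rows: len(rows) >= 2,
    "all_positive": lambda rows: all(r > 0 for r in rows),
})

# schema(...) and schema.validate(...) are interchangeable.
valid = schema([1, 2, 3])
```

In this sketch, schema([-1]) fails both checks and raises one SketchSchemaErrors listing both names, while schema([-1], lazy=False) raises SketchSchemaError for the first failing check only.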