FastAPI

new in 0.9.0

Since both FastAPI and Pandera integrate seamlessly with Pydantic, you can use SchemaModel types to validate the data coming into and going out of your API endpoints.

Using SchemaModels to Validate Endpoint Inputs and Outputs

Suppose we want to process transactions, where each transaction has an id and cost. We can model this with a pandera schema model:

from pydantic import BaseModel

import pandera as pa


class Transactions(pa.SchemaModel):
    id: pa.typing.Series[int]
    cost: pa.typing.Series[float] = pa.Field(ge=0, le=1000)

    class Config:
        coerce = True

Also suppose that we expect our endpoint to add a name to the transaction data:

class TransactionsOut(Transactions):
    id: pa.typing.Series[int]
    cost: pa.typing.Series[float]
    name: pa.typing.Series[str]

Let’s also assume that the endpoint should return a list of dictionary records containing the named transactions data. The to_format option in the schema model BaseConfig handles this conversion:

class TransactionsDictOut(TransactionsOut):
    class Config:
        to_format = "dict"
        to_format_kwargs = {"orient": "records"}

Note that to_format_kwargs is a dictionary of keyword arguments that are passed to the corresponding pandas to_{format} method.
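
As a rough illustration of what that means for the "dict" format, the sketch below calls the pandas method directly with the same keyword arguments. The sample dataframe is hypothetical, and the assumption is that pandera forwards to_format_kwargs to DataFrame.to_dict unchanged, as the note above describes.

```python
import pandas as pd

# Hypothetical data shaped like the TransactionsOut schema.
df = pd.DataFrame(
    {"id": [1, 2], "cost": [10.5, 20.0], "name": ["foo", "foo"]}
)

# Same keyword arguments as to_format_kwargs = {"orient": "records"}:
# the result is a list with one dictionary per row, keyed by column name.
records = df.to_dict(orient="records")
print(records)
```

Other orient values accepted by to_dict, such as "list" or "index", would change the shape of the response accordingly.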

Next we’ll create a FastAPI app and define a /transactions/ POST endpoint:

from fastapi import FastAPI, File

from pandera.typing import DataFrame
from pandera.typing.fastapi import UploadFile

app = FastAPI()

@app.post("/transactions/", response_model=DataFrame[TransactionsDictOut])
def create_transactions(transactions: DataFrame[Transactions]):
    output = transactions.assign(name="foo")
    ...  # do other stuff, e.g. update backend database with transactions
    return output

Reading File Uploads

Just as the TransactionsDictOut example converts a dataframe to a particular format for the endpoint response, pandera provides a from_format schema model configuration option to read a dataframe from a particular serialization format.

class TransactionsParquet(Transactions):
    class Config:
        from_format = "parquet"

Let’s also define a response model for the /file/ upload endpoint:

class TransactionsJsonOut(TransactionsOut):
    class Config:
        to_format = "json"
        to_format_kwargs = {"orient": "records"}

class ResponseModel(BaseModel):
    filename: str
    df: pa.typing.DataFrame[TransactionsJsonOut]
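
For comparison with the "dict" format, here is a small sketch of the pandas call that the "json" format corresponds to. The sample dataframe is hypothetical; the point is only that to_json produces a JSON string rather than a Python list.

```python
import json

import pandas as pd

# Hypothetical data shaped like the TransactionsOut schema.
df = pd.DataFrame({"id": [1], "cost": [10.5], "name": ["foo"]})

# Same keyword arguments as to_format_kwargs = {"orient": "records"}.
payload = df.to_json(orient="records")  # a JSON-encoded string

# Decoding the string recovers one dictionary per row.
rows = json.loads(payload)
```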

In the next example, we use the pandera UploadFile type to upload a Parquet file to the /file/ POST endpoint and return a response containing the filename and the modified data in JSON format.

@app.post("/file/", response_model=ResponseModel)
def create_upload_file(
    file: UploadFile[DataFrame[TransactionsParquet]] = File(...),
):
    return {
        "filename": file.filename,
        "df": file.data.assign(name="foo"),
    }

Pandera’s UploadFile type is a subclass of FastAPI’s UploadFile that additionally exposes a .data property containing the pandera-validated dataframe.

Takeaway

With the FastAPI and Pandera integration, you can use Pandera SchemaModel types to validate the dataframe inputs and outputs of your FastAPI endpoints.