Quick Start Guide
This page describes the basic steps for using the package.
The package is designed to work with pandera YAML schema files that have been
modified to include a metadata key for each column entry.
A good way to create a YAML schema from a pandas DataFrame is to use the pandera.infer_schema function and then add the metadata keys by hand.
You can add the following keys to the metadata key for each of your columns:
unit_of_measure
aliases
decimals
sentinel_values
category
calculation
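To make the list above concrete, here is a rough sketch of what the metadata for one column might contain, written as a Python dict. The value formats shown (a string unit, a list of aliases, an integer number of decimals, a list of sentinel values) are assumptions, not the package's documented format. The calculation key is left out because its expected format is not described here. The helper `unknown_keys` is hypothetical and only checks for misspelled key names.

```python
# The six metadata keys named in this guide.
KNOWN_KEYS = {
    "unit_of_measure",
    "aliases",
    "decimals",
    "sentinel_values",
    "category",
    "calculation",
}

# Hypothetical metadata for a single column; every value is illustrative.
length_metadata = {
    "unit_of_measure": "m",
    "aliases": ["len", "Length"],
    "decimals": 2,
    "sentinel_values": [-999],
    "category": "dimensions",
}


def unknown_keys(metadata):
    """Return, sorted, any keys that are not in the list from this guide."""
    return sorted(set(metadata) - KNOWN_KEYS)


# A misspelled key ("alias" instead of "aliases") is reported.
typo_example = unknown_keys({"unit_of_measure": "m", "alias": ["len"]})
```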
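Create a processor from the schema:
from pandera_utils import DataFrameMetaProcessor  # import name assumed; a hyphen is not valid in a Python module name
processor = DataFrameMetaProcessor(schema)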
Pre-process the dataframe to resolve aliases, apply rounding, and perform calculations.
processed_df = processor.preprocess(dataframe)
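To illustrate what alias handling and rounding can mean in practice, the following sketch does both with plain pandas. It is not the package's implementation: the function `sketch_preprocess`, the shape of the `metadata` dict, and the column names are all hypothetical.

```python
import pandas as pd

# Hypothetical per-column metadata; the value formats are assumptions.
metadata = {
    "length": {"aliases": ["len", "Length"], "decimals": 1},
    "width": {"aliases": ["w"], "decimals": 2},
}


def sketch_preprocess(df, metadata):
    """Rename alias columns to their canonical names, then round each column."""
    # Map every alias present in the frame to its canonical column name.
    rename_map = {
        alias: name
        for name, meta in metadata.items()
        for alias in meta.get("aliases", [])
        if alias in df.columns
    }
    out = df.rename(columns=rename_map)
    # Round only the columns that declare a number of decimals.
    decimals = {
        name: meta["decimals"]
        for name, meta in metadata.items()
        if "decimals" in meta and name in out.columns
    }
    return out.round(decimals)


raw = pd.DataFrame({"len": [1.234, 5.678], "w": [0.1234, 0.5678]})
clean = sketch_preprocess(raw, metadata)
```

Here `clean` has the canonical column names `length` and `width`, rounded to one and two decimals respectively.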
Finally, you can validate the dataframe using the schema.
processor.validate(processed_df)