Basic Engine Usage

This example walks through core df_eval.Engine patterns on a small in-memory DataFrame.

It covers:

  • Creating an Engine

  • Evaluating a single expression with Engine.evaluate()

  • Defining multiple derived columns with Engine.apply_schema()

  • Using a few built-in safe functions

  • Controlling output types with dtypes

  • Rounding a derived column with a decimals schema entry

  • Evaluating multiple independent expressions with Engine.evaluate_many()

import pandas as pd

from df_eval import Engine

Build Input Data

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df
   a  b
0  1  4
1  2  5
2  3  6


Create Engine, Single Expression

engine = Engine()

single_result = engine.evaluate(df, "a + b")
single_result
0    5
1    7
2    9
dtype: int64
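
For orientation, the result above matches what plain pandas gives for the same expression. The sketch below uses only standard pandas calls (column arithmetic and DataFrame.eval) and does not touch df_eval. It is a cross-check of the values, not a description of how Engine.evaluate() works internally.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Element-wise column arithmetic in plain pandas.
direct = df["a"] + df["b"]

# pandas' own expression evaluator, given the same expression string.
via_eval = df.eval("a + b")

print(direct.tolist())    # [5, 7, 9]
print(via_eval.tolist())  # [5, 7, 9]
```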

Schema-Driven Derived Columns

schema = {
    "sum": "a + b",
    "product": "a * b",
    "ratio": "a / b",
    "safe_ratio": "safe_divide(a, b)",
    "ratio_rounded": "round(a / b, 2)",
    "ratio_ceiling": "ceil(a / b)",
    "ratio_floor": "floor(a / b)",
}

df_with_derived = engine.apply_schema(df, schema)
df_with_derived
   a  b  product  ratio  ratio_ceiling  ratio_floor  ratio_rounded  safe_ratio  sum
0  1  4        4   0.25            1.0          0.0           0.25        0.25    5
1  2  5       10   0.40            1.0          0.0           0.40        0.40    7
2  3  6       18   0.50            1.0          0.0           0.50        0.50    9
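
The derived values above can be reproduced row by row with the standard library, which may help when checking a schema by hand. This is a sketch under assumptions: safe_divide_sketch is a hypothetical stand-in, and returning NaN for a zero denominator is assumed. This example never exercises that case, so it does not show what df_eval's safe_divide does there.

```python
import math


def safe_divide_sketch(numerator, denominator):
    """Hypothetical stand-in: returns NaN on a zero denominator (assumed behavior)."""
    if denominator == 0:
        return math.nan
    return numerator / denominator


rows = [(1, 4), (2, 5), (3, 6)]  # (a, b) pairs from the input DataFrame

# One dict of derived values per input row, mirroring the schema keys.
# Note: math.ceil and math.floor return ints here, whereas the DataFrame
# output above shows the same values as floats (1.0 and 0.0).
derived = [
    {
        "sum": a + b,
        "product": a * b,
        "ratio": a / b,
        "safe_ratio": safe_divide_sketch(a, b),
        "ratio_rounded": round(a / b, 2),
        "ratio_ceiling": math.ceil(a / b),
        "ratio_floor": math.floor(a / b),
    }
    for a, b in rows
]

for row in derived:
    print(row)
```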


Controlling Output Types with dtypes

typed_schema = {
    "float_sum": "a + b",
    "int_product": "a * b",
}

typed_result = engine.apply_schema(
    df,
    typed_schema,
    dtypes={"float_sum": "float64", "int_product": "int32"},
)
typed_result.dtypes
a                int64
b                int64
float_sum      float64
int_product      int32
dtype: object
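
The dtypes shown above are the same ones a plain pandas cast would produce. The sketch below computes the two columns directly and casts them with DataFrame.astype. It illustrates the target types only and makes no claim about how apply_schema applies its dtypes argument internally.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Compute the derived columns, then cast each one to its requested dtype.
typed = df.assign(
    float_sum=df["a"] + df["b"],
    int_product=df["a"] * df["b"],
).astype({"float_sum": "float64", "int_product": "int32"})

print(typed.dtypes)
```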

Schema Spec with decimals

pricing_df = pd.DataFrame({"price": [12.3456, 99.9949, 0.3333]})

rounding_schema = {
    "price_2dp": {"expr": "price", "decimals": 2},
}

rounded_result = engine.apply_schema(pricing_df, rounding_schema)
rounded_result
     price  price_2dp
0  12.3456      12.35
1  99.9949      99.99
2   0.3333       0.33
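
The rounded column can be checked by hand with Python's built-in round(). This is only an arithmetic cross-check: none of the sample prices is an exact tie, so it says nothing about which rounding mode df_eval uses for ties.

```python
prices = [12.3456, 99.9949, 0.3333]

# Round each price to two decimal places.
price_2dp = [round(p, 2) for p in prices]

print(price_2dp)  # [12.35, 99.99, 0.33]
```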


Evaluating Multiple Independent Expressions

expressions = {
    "sum": "a + b",
    "product": "a * b",
    "avg": "(a + b) / 2",
}

many_results = engine.evaluate_many(df, expressions)

{
    name: series.tolist() for name, series in many_results.items()
}
{'a': [1, 2, 3], 'b': [4, 5, 6], 'avg': [2.5, 3.5, 4.5], 'product': [4, 10, 18], 'sum': [5, 7, 9]}
