Reblocking (Up/Down Sampling)#
Reblocking changes the block size of a regular model:
Upsampling creates a finer grid (smaller blocks).
Downsampling creates a coarser grid (larger blocks).
Configuration is explicit per attribute. This avoids accidental defaults when continuous and class-like attributes are mixed in one model.
Note
Not all upsampling methods are interpolative. mode, nearest, and
parent are class-preserving assignment methods.
Upsampling#
Use parq_blockmodel.blockmodel.ParquetBlockModel.upsample() with an
upsample_config mapping every attribute to a method.
Supported methods are:
linear: continuous interpolation (for continuous numeric attributes)
nearest: nearest-source assignment (class-safe)
mode: neighborhood class mode with deterministic tie-break (lowest code/value)
parent: inherit value directly from the parent block (exact replication)
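To make the difference between interpolative and class-preserving methods concrete, here is a minimal one-dimensional NumPy sketch. It does not call parq_blockmodel; the arrays and variable names are illustrative assumptions, and the package's own boundary handling may differ from the edge clamping that np.interp applies here.

```python
import numpy as np

# One row of four parent blocks, each 1.0 wide (centroids at 0.5 .. 3.5).
parent_centroids = np.array([0.5, 1.5, 2.5, 3.5])
grade = np.array([1.0, 2.0, 4.0, 3.0])   # continuous attribute
rock_type = np.array([10, 10, 20, 30])   # integer-coded classes

# Halving the block size gives two children per parent.
factor = 2
child_size = 1.0 / factor
n_children = parent_centroids.size * factor
child_centroids = np.arange(n_children) * child_size + child_size / 2

# "parent"-style assignment: every child copies its parent's value exactly,
# so no new class codes can appear.
rock_type_children = np.repeat(rock_type, factor)

# "linear"-style interpolation between parent centroids.
# np.interp holds the first/last value constant outside the centroid range.
grade_children = np.interp(child_centroids, parent_centroids, grade)

print(rock_type_children.tolist())
print(grade_children.tolist())
```

The interpolated grades include values such as 1.25 and 3.75 that no parent block holds. That is acceptable for a continuous attribute but would invent meaningless codes for a class-like one, which is why class-like attributes are paired with nearest, mode, or parent.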
Typical mixed configuration:
upsampled = pbm.upsample(
    new_block_size=(0.5, 0.5, 0.5),
    upsample_config={
        "grade": "linear",        # continuous
        "density": "linear",      # continuous
        "rock_type": "mode",      # categorical / class-like
        "domain_code": "parent",  # integer-coded classes
    },
)
If any attribute is omitted from upsample_config, upsampling fails
immediately with a clear error.
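The fail-fast rule can be pictured with a small standalone check. This is only a sketch: check_upsample_config is a hypothetical helper, not part of parq_blockmodel, and raising ValueError is an assumption because the exception type is not stated here.

```python
# Method names as listed in this section.
SUPPORTED_METHODS = {"linear", "nearest", "mode", "parent"}


def check_upsample_config(attributes, upsample_config):
    """Hypothetical validator: every attribute needs exactly one known method."""
    missing = sorted(set(attributes) - set(upsample_config))
    if missing:
        raise ValueError(f"upsample_config is missing attributes: {missing}")
    unknown = sorted(
        name for name, method in upsample_config.items()
        if method not in SUPPORTED_METHODS
    )
    if unknown:
        raise ValueError(f"unsupported upsample method for: {unknown}")


attributes = ["grade", "density", "rock_type"]
try:
    # "rock_type" is deliberately left out to trigger the error.
    check_upsample_config(attributes, {"grade": "linear", "density": "linear"})
except ValueError as exc:
    print(exc)
```

Checking the whole configuration before any resampling starts means a forgotten class-like attribute is reported up front rather than silently receiving a default method.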
Downsampling#
Use parq_blockmodel.blockmodel.ParquetBlockModel.downsample() with an
aggregation_config mapping attributes to method dictionaries.
Common methods include:
mean
sum
weighted_mean (requires basis)
mode (class-like attributes)
Typical mixed configuration:
downsampled = pbm.downsample(
    new_block_size=(2.0, 2.0, 2.0),
    aggregation_config={
        "grade": {"method": "weighted_mean", "basis": "dry_mass"},
        "dry_mass": {"method": "sum"},
        "volume": {"method": "sum"},
        "rock_type": {"method": "mode"},
        "domain_code": {"method": "mode"},
    },
)
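What these aggregation methods compute can be sketched with plain pandas. The example below does not use parq_blockmodel: the parent_id column and the sample values are made up, and resolving mode ties to the lowest code is an assumption borrowed from the upsampling description.

```python
import pandas as pd

# Eight child blocks that collapse into two coarse parent blocks.
children = pd.DataFrame({
    "parent_id": [0, 0, 0, 0, 1, 1, 1, 1],
    "grade":     [1.0, 2.0, 3.0, 4.0, 0.5, 0.5, 1.5, 1.5],
    "dry_mass":  [10.0, 10.0, 20.0, 40.0, 5.0, 5.0, 5.0, 5.0],
    "rock_type": [1, 1, 2, 2, 3, 3, 3, 4],
})

# weighted_mean with basis dry_mass: sum(grade * dry_mass) / sum(dry_mass).
weighted = children.assign(grade_x_mass=children["grade"] * children["dry_mass"])
grouped = weighted.groupby("parent_id")

parents = pd.DataFrame({
    "dry_mass": grouped["dry_mass"].sum(),                       # sum
    "grade": grouped["grade_x_mass"].sum() / grouped["dry_mass"].sum(),
    # Series.mode() returns the modes sorted, so iloc[0] is the lowest code.
    "rock_type": grouped["rock_type"].agg(lambda s: s.mode().iloc[0]),
})

print(parents)
```

For parent 0 the mass-weighted grade is 3.125, whereas an unweighted mean of the same four children would give 2.5. The gap shows why grades are usually aggregated against a mass or volume basis.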
Using schema-calculated aggregation inputs#
Aggregation inputs can come from schema df-eval metadata, not only persisted
parquet columns. This is useful when a weighted basis (or an aggregated target)
is derived from other attributes.
Example: tonnes and contained_metal are calculated in the schema, then
used in downsampling:
downsampled = pbm.downsample(
    new_block_size=(2.0, 2.0, 2.0),
    aggregation_config={
        "grade": {"method": "mean"},
        "contained_metal": {"method": "weighted_mean", "basis": "tonnes"},
    },
)
In this call:
tonnes is used as the weighted basis even if it is not stored on disk.
contained_metal can be aggregated even if it is only schema-defined.
Aggregation math stays unchanged; only input materialization differs.
For regular models, basis: "volume" can also be used without persisting a
volume column. volume is available as a built-in calculated column from
geometry (pbm.geometry.block_volume).
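The idea of materializing calculated inputs before aggregation can be illustrated with pandas alone. In this sketch the calculated dictionary stands in for schema metadata whose real format is not shown here, the expressions are invented, and the block volume is assumed to be the product of the block dimensions.

```python
import pandas as pd

# Columns actually persisted for four child blocks.
stored = pd.DataFrame({
    "density": [2.5, 2.7, 3.0, 2.8],
    "grade":   [0.010, 0.020, 0.015, 0.005],
})

# Regular model: every block has the same volume, derived from geometry.
block_size = (1.0, 1.0, 1.0)
block_volume = block_size[0] * block_size[1] * block_size[2]

# Hypothetical calculated attributes, expressed as DataFrame.eval strings.
calculated = {
    "tonnes": "volume * density",
    "contained_metal": "tonnes * grade",
}

# Materialize the derived columns in memory, in definition order.
materialized = stored.assign(volume=block_volume)
for name, expression in calculated.items():
    materialized[name] = materialized.eval(expression)

# The derived columns can now act as an aggregation target or as a basis.
total_tonnes = materialized["tonnes"].sum()
tonnes_weighted_grade = materialized["contained_metal"].sum() / total_tonnes

print(list(stored.columns))
print(total_tonnes, tonnes_weighted_grade)
```

The stored frame still holds only density and grade afterwards. The derived columns exist solely in the in-memory copy that feeds the aggregation.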
Handling partially-filled blocks with fill_ratio#
When downsampling sparse or partially-filled block models, use the fill_ratio
key to normalize aggregation by the fraction of blocks actually occupied.
This is useful if child blocks in a region are only partly filled (e.g., from surface-only meshes or sparse models), and you want to avoid underestimating aggregated values.
Example: downsampling a grade with fill-aware weighting:
downsampled = pbm.downsample(
    new_block_size=(2.0, 2.0, 2.0),
    aggregation_config={
        "grade": {"method": "weighted_mean", "basis": "mass", "fill_ratio": "fill_factor"},
        "mass": {"method": "sum", "fill_ratio": "fill_factor"},
        "fill_factor": {"method": "mean"},  # required: aggregate the fill ratio itself
    },
)
How it works:
fill_factor must be an attribute (array) with values in [0, 1] that indicate the fraction of child blocks that are occupied in each coarse block.
For sum with fill_ratio: "fill_factor": result = (sum of values) / mean(fill_factor).
For weighted_mean with fill_ratio: "fill_factor": result = (sum of (values * basis)) / (mean(fill_factor) * sum of basis).
Only one fill_ratio attribute may be used across all aggregations in
a single downsampling call. If you omit fill_ratio, aggregations work
on all available child blocks without normalization.
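The two fill-aware formulas can be checked by hand on a single coarse block. The NumPy sketch below transcribes them directly for four made-up child blocks; it is an illustration of the arithmetic, not the package's implementation.

```python
import numpy as np

# Four child blocks that fall inside one coarse block.
grade = np.array([1.0, 2.0, 1.5, 0.5])
mass = np.array([8.0, 4.0, 6.0, 6.0])
fill_factor = np.array([1.0, 0.5, 0.75, 0.75])

mean_fill = fill_factor.mean()

# sum with fill_ratio: (sum of values) / mean(fill_factor)
mass_filled = mass.sum() / mean_fill

# weighted_mean with fill_ratio:
# (sum of values * basis) / (mean(fill_factor) * sum of basis)
grade_filled = (grade * mass).sum() / (mean_fill * mass.sum())

# The same aggregations without fill normalization, for comparison.
mass_plain = mass.sum()
grade_plain = (grade * mass).sum() / mass.sum()

print(mean_fill, mass_filled, grade_filled)
print(mass_plain, grade_plain)
```

With a mean fill of 0.75, the plain mass sum of 24 is scaled up to 32, and the weighted mean is divided by the same factor. The aggregated fill_factor itself is kept as a plain mean, matching the required entry in the configuration above.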
Choosing methods#
Use this rule of thumb:
Continuous attributes (grade, density, porosity): linear for upsample, mean/weighted_mean for downsample.
Class-like attributes (categorical labels, integer-coded domains): parent, nearest, or mode for upsample, mode for downsample.
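One way to apply this rule of thumb is to draft both configurations from column dtypes and then review them by hand. The helper below is a hypothetical convenience, not part of parq_blockmodel; it assumes float columns are continuous and everything else is class-like, which will not hold for every model.

```python
import pandas as pd
from pandas.api.types import is_float_dtype


def suggest_configs(blocks: pd.DataFrame):
    """Draft upsample and aggregation configs from dtypes (review before use)."""
    upsample_config = {}
    aggregation_config = {}
    for name, dtype in blocks.dtypes.items():
        if is_float_dtype(dtype):
            # Assumed continuous: interpolate up, average down.
            upsample_config[name] = "linear"
            aggregation_config[name] = {"method": "mean"}
        else:
            # Assumed class-like: replicate up, take the mode down.
            upsample_config[name] = "parent"
            aggregation_config[name] = {"method": "mode"}
    return upsample_config, aggregation_config


blocks = pd.DataFrame({
    "grade": [0.8, 1.2],
    "density": [2.6, 2.7],
    "rock_type": ["ore", "waste"],
    "domain_code": [101, 102],
})

upsample_config, aggregation_config = suggest_configs(blocks)
print(upsample_config)
print(aggregation_config)
```

The draft covers every attribute, so nothing is left to an implicit default. Additive quantities such as mass or volume still need to be switched to sum, and grades to weighted_mean with a suitable basis, before the configuration is used.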
IDW interpolation#
Inverse Distance Weighting (IDW) is not currently implemented.
In this package, adding IDW for upsampling is likely non-trivial because it would need:
method wiring across API/config validation,
deterministic handling for boundary and missing data,
behavior definition for integer and class-like attributes,
new regression and integration tests.
Given that, IDW is best treated as a separate scoped feature after the class-safe upsampling behavior is finalized.