parq_tools.utils.index_utils.validate_index_alignment

parq_tools.utils.index_utils.validate_index_alignment(datasets, index_columns, batch_size=100000)

Validates that the index columns are identical across all datasets.

Parameters:
  • datasets (List[ds.Dataset]) – List of PyArrow datasets to validate.

  • index_columns (List[str]) – List of index column names to compare.

  • batch_size (int, optional) – Number of rows to process per batch. Defaults to 100000.

Raises:
  • ValueError – If the index columns are not identical across datasets.

Return type:
  None