parq_tools.utils.index_utils.validate_index_alignment
- parq_tools.utils.index_utils.validate_index_alignment(datasets, index_columns, batch_size=100000)
Validates that the index columns are identical across all datasets.
- Parameters:
datasets (List[ds.Dataset]) – List of PyArrow datasets to validate.
index_columns (List[str]) – List of index column names to compare.
batch_size (int, optional) – Number of rows per batch to process. Defaults to 100000.
- Raises:
ValueError – If the index columns are not identical across datasets.
- Return type:
None
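The behaviour described above (compare the index columns batch by batch, raise ValueError on the first difference, return None otherwise) can be illustrated with a minimal pure-Python sketch. This is not the parq_tools implementation: `check_index_alignment` and `iter_index_batches` are hypothetical names, and plain dictionaries of column lists stand in for PyArrow datasets. The sketch also assumes that "identical" means the same index values in the same row order, which the reference text does not spell out.

```python
from itertools import islice, zip_longest
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical stand-in for a dataset: column name -> column values.
Columns = Dict[str, Sequence]


def iter_index_batches(
    columns: Columns, index_columns: List[str], batch_size: int
) -> Iterator[List[Tuple]]:
    """Yield lists of index tuples, at most batch_size rows per list."""
    rows = zip(*(columns[name] for name in index_columns))
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch


def check_index_alignment(
    datasets: List[Columns], index_columns: List[str], batch_size: int = 100000
) -> None:
    """Raise ValueError if any dataset's index differs from the first one."""
    if len(datasets) < 2:
        return  # nothing to compare against
    reference, *others = datasets
    for position, other in enumerate(others, start=1):
        # zip_longest pads the shorter side with None, so a dataset with
        # fewer or more rows than the reference is reported as a mismatch.
        pairs = zip_longest(
            iter_index_batches(reference, index_columns, batch_size),
            iter_index_batches(other, index_columns, batch_size),
        )
        for batch_number, (ref_batch, other_batch) in enumerate(pairs):
            if ref_batch != other_batch:
                raise ValueError(
                    f"Index columns {index_columns} differ between dataset 0 "
                    f"and dataset {position} (batch {batch_number})."
                )


if __name__ == "__main__":
    left = {"x": [0, 1, 2], "y": [0, 0, 1], "grade": [1.0, 2.0, 3.0]}
    right = {"x": [0, 1, 2], "y": [0, 0, 1], "density": [2.7, 2.8, 2.9]}
    check_index_alignment([left, right], ["x", "y"], batch_size=2)
    print("index columns are aligned")
```

Only the index columns are read, so the non-index columns may differ freely between datasets. Working in batches keeps memory use bounded by roughly `batch_size` rows per dataset, which is presumably the motivation for the `batch_size` parameter in the documented function.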