parq_tools.utils.index_utils

Functions

dedup_index_parquet

Remove rows with duplicate index-column values from a Parquet file.

reindex_parquet

Reindex a sparse Parquet file to align with a new index, processing in chunks.

sort_parquet_file

Globally sort a Parquet file by the specified columns.

validate_index_alignment

Validate that the index columns are identical across all datasets.