parq_tools.utils.index_utils.reindex_parquet
- parq_tools.utils.index_utils.reindex_parquet(sparse_parquet_path, output_path, new_index, chunk_size=100000, sort_after_reindex=True)
Reindex a sparse Parquet file to align with a new index, processing in chunks.
- Parameters:
sparse_parquet_path (Path) – Path to the sparse Parquet file.
output_path (Path) – Path to save the re-indexed Parquet file.
new_index (pa.Table) – New index as a PyArrow table.
chunk_size (int) – Number of rows to process per chunk. Defaults to 100000.
sort_after_reindex (bool) – Whether to sort the output after reindexing. Defaults to True.
- Return type:
None