OMF Block Model to Parquet

Parquet is a columnar storage format, so individual columns can be read and written without loading the whole file, which can reduce memory consumption. This example demonstrates how to convert an OMF block model to a Parquet file.

Note

Presently there is no low-memory option for this method, so it is not suitable for very large files. As such it offers no advantage over the standard Pandas method for saving to Parquet. However, this is the first step in a series of methods that will allow for more efficient handling of large files.

import logging
from pathlib import Path

import pandas as pd

from omfpandas import OMFDataConverter, OMFPandasReader

Instantiate

Create an OMFDataConverter object with the path to the OMF file.

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(module)s - %(funcName)s: %(message)s',
                    datefmt='%Y-%m-%dT%H:%M:%S%z')
test_omf_path: Path = Path('./../assets/v2/test_file.omf')
omf_converter: OMFDataConverter = OMFDataConverter(filepath=test_omf_path)

# Display the head of the original block model
blocks: pd.DataFrame = OMFPandasReader(filepath=test_omf_path).read_blockmodel(blockmodel_name='vol')
print("Original DataFrame:")
blocks.head()
Original DataFrame:
                                random attr
x     y     z     dx   dy   dz
10.5  10.5  -9.5  1.0  1.0  1.0    0.727986
11.5  10.5  -9.5  1.0  1.0  1.0    0.277389
12.5  10.5  -9.5  1.0  1.0  1.0    0.351741
13.5  10.5  -9.5  1.0  1.0  1.0    0.999272
14.5  10.5  -9.5  1.0  1.0  1.0    0.495092


Convert

View the elements in the OMF file first.

print(omf_converter.elements)
{'Random Points': 'PointSet', 'Random Line': 'LineSet', 'trisurf': 'Surface', 'gridsurf': 'TensorGridSurface', 'vol': 'TensorGridBlockModel'}
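The printed mapping of element names to element types can be filtered to find which elements are block models. The sketch below works on a copy of the dictionary shown above, so it runs without the OMF file; matching on the `BlockModel` suffix is an assumption about how block model types are named.

```python
# Copy of the mapping printed above (element name -> element type)
elements = {
    'Random Points': 'PointSet',
    'Random Line': 'LineSet',
    'trisurf': 'Surface',
    'gridsurf': 'TensorGridSurface',
    'vol': 'TensorGridBlockModel',
}

# Assumption: block model element types end with 'BlockModel'
blockmodel_names = [name for name, kind in elements.items() if kind.endswith('BlockModel')]
print(blockmodel_names)  # ['vol']
```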

Convert the 'vol' block model to a Parquet file.

omf_converter.blockmodel_to_parquet(blockmodel_name='vol', parquet_filepath=Path('blocks.parquet'),
                                    allow_overwrite=True)

Load the Parquet

Reload the Parquet file and display the head.

blocks_2: pd.DataFrame = pd.read_parquet('blocks.parquet')
print("Reloaded DataFrame:")
blocks_2.head()
Reloaded DataFrame:
                                random attr
x     y     z     dx   dy   dz
10.5  10.5  -9.5  1.0  1.0  1.0    0.727986
11.5  10.5  -9.5  1.0  1.0  1.0    0.277389
12.5  10.5  -9.5  1.0  1.0  1.0    0.351741
13.5  10.5  -9.5  1.0  1.0  1.0    0.999272
14.5  10.5  -9.5  1.0  1.0  1.0    0.495092


Validate

Assert that the original DataFrame and the reloaded DataFrame are equivalent.

assert blocks.equals(blocks_2), "The original DataFrame and the reloaded DataFrame are not equivalent."
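As an optional alternative to `DataFrame.equals`, `pandas.testing.assert_frame_equal` reports where two frames differ when the comparison fails. The sketch below uses a small hypothetical frame shaped like the block model (only the first two rows and three index levels) so that it runs on its own.

```python
import pandas as pd

# Hypothetical two-row frame with the same layout as the block model
index = pd.MultiIndex.from_tuples(
    [(10.5, 10.5, -9.5), (11.5, 10.5, -9.5)], names=["x", "y", "z"]
)
original = pd.DataFrame({"random attr": [0.727986, 0.277389]}, index=index)
reloaded = original.copy()

# Passes silently when values, dtypes, index and columns all match
pd.testing.assert_frame_equal(original, reloaded)

# Introduce a deliberate mismatch to see the diagnostic message
reloaded.iloc[0, 0] = 0.0
try:
    pd.testing.assert_frame_equal(original, reloaded)
    mismatch_detected = False
except AssertionError as err:
    mismatch_detected = True
    print(err)
```

The error message names the column and the differing values, which is helpful when a round trip changes a dtype or an index level.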

