Interval Data
This example adds a second dimension. The second dimension is an interval, of the form interval_from, interval_to. This is also known as binned data, where each "bin" is bounded by a lower and an upper limit.
An interval is relevant in geology, when analysing drill hole data.
Intervals are also encountered in metallurgy, where they are usually called fractions, e.g. size fractions. The typical nomenclature in that case is size_retained, size_passing, since the data originates from a sieve stack.
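To make the from/to idea concrete, here is a minimal sketch in plain pandas, independent of the library used in this example. The depths and grades are hypothetical, and the choice of left-closed intervals is an assumption made for illustration only.

```python
import pandas as pd

# Hypothetical drill hole intervals (metres down hole); values are illustrative only.
df = pd.DataFrame({
    "interval_from": [0.0, 2.0, 4.0],
    "interval_to": [2.0, 4.0, 6.0],
    "Fe": [58.0, 61.5, 60.2],
})

# Combine the two bounds into a single IntervalIndex.
# closed="left" is an assumption: each bin includes its lower limit only.
intervals = pd.IntervalIndex.from_arrays(
    df["interval_from"], df["interval_to"], closed="left", name="interval"
)
binned = df.set_index(intervals)[["Fe"]]

# The width of each bin, and the position of the bin that contains 3.0 m.
lengths = intervals.length
position = intervals.get_loc(3.0)
print(binned)
```

The same construction would apply to size fractions, with size_retained and size_passing taking the place of the two bounds.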
import logging
import pandas as pd
import plotly.io
from matplotlib import pyplot as plt
from elphick.geomet import Sample, IntervalSample
from elphick.geomet.data.downloader import Downloader
from elphick.geomet.utils.pandas import weight_average
import plotly.graph_objects as go
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(module)s - %(funcName)s: %(message)s',
                    datefmt='%Y-%m-%dT%H:%M:%S%z',
                    )
Create a Sample object
We get some demo data in the form of a pandas DataFrame. We create this object as 1D, based on the pandas index.
iron_ore_sample_data: pd.DataFrame = Downloader().load_data(datafile='iron_ore_sample_A072391.zip', show_report=False)
df_data: pd.DataFrame = iron_ore_sample_data
df_data.head()
obj_mc: Sample = Sample(df_data, name='Drill program')
obj_mc
<elphick.geomet.sample.Sample object at 0x7f8eac97aa40>
obj_mc.aggregate
Use the normal pandas groupby-apply as needed. Here we leverage the weight_average function from utils.pandas.
hole_average: pd.DataFrame = obj_mc.data.groupby('DHID').apply(weight_average)
hole_average
/home/runner/work/geometallurgy/geometallurgy/examples/02_interval_sample/01_interval_sample.py:58: DeprecationWarning:
DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
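For readers unfamiliar with the idea, a mass-weighted average per drill hole can be sketched in plain pandas as follows. This is not the library's weight_average implementation: mass_weighted_mean is a hypothetical helper and the data are made up. Selecting the value columns after groupby is one of the two remedies named in the warning above, so the grouping column is never passed to the function.

```python
import pandas as pd

# Hypothetical assay intervals for two drill holes; values are illustrative only.
df = pd.DataFrame({
    "DHID": ["A", "A", "B", "B"],
    "mass_dry": [10.0, 30.0, 20.0, 20.0],
    "Fe": [60.0, 62.0, 58.0, 64.0],
    "SiO2": [3.0, 2.0, 4.0, 1.0],
})


def mass_weighted_mean(group: pd.DataFrame, mass_col: str = "mass_dry") -> pd.Series:
    """Hypothetical helper: total the mass and mass-weight every other column."""
    mass = group[mass_col]
    grades = group.drop(columns=[mass_col])
    # Weight each grade by its interval mass, then normalise by the total mass.
    weighted = grades.mul(mass, axis=0).sum() / mass.sum()
    return pd.concat([pd.Series({mass_col: mass.sum()}), weighted])


# Select the value columns explicitly so DHID is excluded from the operation.
hole_mean = df.groupby("DHID")[["mass_dry", "Fe", "SiO2"]].apply(mass_weighted_mean)
print(hole_mean)
```

Mass is summed rather than averaged because it is an extensive quantity, while grades are intensive and must be weighted by the mass they apply to.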
We will now make a 2D dataset using DHID and the intervals.
df_data['DHID'] = df_data['DHID'].astype('category')
df_data = df_data.reset_index(drop=True).set_index(['DHID', 'interval_from', 'interval_to'])
obj_mc_2d: IntervalSample = IntervalSample(df_data, name='Drill program')
print(obj_mc_2d)
IntervalSample: Drill program
{'mass_wet': {0: 2029.6178076448032}, 'mass_dry': {0: 1981.688}, 'H2O': {0: 2.3615188763258583}, 'MgO': {0: 0.08051321903347046}, 'MnO': {0: 0.14921928174364477}, 'Al2O3': {0: 1.773585095131019}, 'P': {0: 0.044627670955266416}, 'Fe': {0: 60.443937895370006}, 'SiO2': {0: 2.827210176374888}, 'TiO2': {0: 0.06297808534945964}, 'CaO': {0: 0.12507133312610258}, 'Na2O': {0: 0.015876646576050316}, 'K2O': {0: 0.013163565606694896}}
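The three mass-related values in the printed aggregate can be cross-checked with a few lines of arithmetic. The sketch below assumes that H2O is moisture on a wet basis, i.e. water mass as a percentage of wet mass. That convention is inferred from the numbers rather than stated in this example.

```python
# Values copied from the aggregate printed above.
mass_wet = 2029.6178076448032
mass_dry = 1981.688
h2o_reported = 2.3615188763258583

# Assumed wet-basis moisture: water mass as a percentage of wet mass.
h2o = (mass_wet - mass_dry) / mass_wet * 100.0
difference = abs(h2o - h2o_reported)
print(h2o, difference)
```

Under that assumption the recomputed moisture agrees with the reported value to within floating-point rounding.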
obj_mc_2d.aggregate
obj_mc_2d.data.groupby('DHID').apply(weight_average, **{'mass_wet': 'mass_wet', 'moisture_column_name': 'H2O'})
View some plots
fig: go.Figure = obj_mc_2d.plot_parallel(color='DHID')
plotly.io.show(fig)
obj_mc_2d.query('DHID=="CBS02"').reset_index('DHID').plot_intervals(variables=['mass_dry', 'Fe', 'SiO2', 'Al2O3'],
                                                                    cumulative=False)