.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/examples/02_interval_sample/06_resampling_interval_data_2d.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_examples_02_interval_sample_06_resampling_interval_data_2d.py: Resampling 2D Interval Data =========================== The Sink Float metallurgical test splits/fractionates samples by density. The density fraction is often conducted by size fraction, resulting in 2D fractionation (interval) data. This example demonstrates how to resample 2D interval data using the IntervalSample object. .. GENERATED FROM PYTHON SOURCE LINES 11-24 .. code-block:: Python import logging # noinspection PyUnresolvedReferences import numpy as np import pandas as pd import plotly.io from elphick.geomet import IntervalSample from elphick.geomet.datasets import datasets from elphick.geomet.utils.pandas import MeanIntervalIndex from elphick.geomet.utils.size import sizes_all .. GENERATED FROM PYTHON SOURCE LINES 25-29 .. code-block:: Python logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(module)s - %(funcName)s: %(message)s', datefmt='%Y-%m-%dT%H:%M:%S%z') .. GENERATED FROM PYTHON SOURCE LINES 30-34 Load Data --------- We load some real data. .. GENERATED FROM PYTHON SOURCE LINES 35-39 .. code-block:: Python df_data: pd.DataFrame = datasets.load_nordic_iron_ore_sink_float() df_data .. raw:: html

	size_retained	size_passing	density_lo	density_hi	mass_pct	Fe	SiO2	P	TiO2	V
0	1.000	NaN	NaN	NaN	0.4	22.7	50.9	0.042	0.170	0.0049
1	0.100	1.000	NaN	NaN	67.1	27.3	47.7	0.107	0.178	0.0062
2	0.063	0.100	NaN	NaN	12.7	18.0	57.8	0.440	0.200	0.0056
3	0.040	0.063	NaN	NaN	8.2	16.9	57.1	0.610	0.235	0.0057
4	0.000	0.040	NaN	NaN	11.6	19.4	51.6	0.650	0.310	0.0072
5	0.100	NaN	NaN	2.7	54.2	1.2	1.6	0.180	0.084	0.0030
6	0.100	NaN	2.7	3.3	9.7	16.6	42.6	0.980	0.380	0.0100
7	0.100	NaN	3.3	NaN	36.1	68.0	78.8	0.033	0.285	0.0120
8	0.063	0.100	NaN	2.7	24.0	1.2	79.5	0.015	0.060	0.0070
9	0.063	0.100	2.7	3.3	11.9	10.2	54.0	2.320	0.280	0.0080
10	0.063	0.100	3.3	NaN	64.1	67.1	1.4	0.174	0.530	0.0020
11	0.040	0.063	NaN	2.7	76.6	3.1	71.3	0.850	0.145	0.0130
12	0.040	0.063	2.7	3.3	4.3	28.5	24.5	2.780	0.460	0.1100
13	0.040	0.063	3.3	NaN	19.1	68.6	0.6	0.069	0.480	0.0130

.. GENERATED FROM PYTHON SOURCE LINES 40-42 The dataset contains size x assay, plus size x density x assay data. We'll drop the size x assay data to leave the sink / float data. .. GENERATED FROM PYTHON SOURCE LINES 42-46 .. code-block:: Python df_sink_float: pd.DataFrame = df_data.dropna(subset=['density_lo', 'density_hi'], how='all').copy() df_sink_float .. raw:: html

	size_retained	size_passing	density_lo	density_hi	mass_pct	Fe	SiO2	P	TiO2	V
5	0.100	NaN	NaN	2.7	54.2	1.2	1.6	0.180	0.084	0.003
6	0.100	NaN	2.7	3.3	9.7	16.6	42.6	0.980	0.380	0.010
7	0.100	NaN	3.3	NaN	36.1	68.0	78.8	0.033	0.285	0.012
8	0.063	0.100	NaN	2.7	24.0	1.2	79.5	0.015	0.060	0.007
9	0.063	0.100	2.7	3.3	11.9	10.2	54.0	2.320	0.280	0.008
10	0.063	0.100	3.3	NaN	64.1	67.1	1.4	0.174	0.530	0.002
11	0.040	0.063	NaN	2.7	76.6	3.1	71.3	0.850	0.145	0.013
12	0.040	0.063	2.7	3.3	4.3	28.5	24.5	2.780	0.460	0.110
13	0.040	0.063	3.3	NaN	19.1	68.6	0.6	0.069	0.480	0.013

.. GENERATED FROM PYTHON SOURCE LINES 47-48 We will fill some nan values with assumptions .. GENERATED FROM PYTHON SOURCE LINES 48-52 .. code-block:: Python df_sink_float['size_passing'].fillna(1.0, inplace=True) df_sink_float['density_lo'].fillna(1.5, inplace=True) df_sink_float['density_hi'].fillna(5.0, inplace=True) .. GENERATED FROM PYTHON SOURCE LINES 53-54 Check the mass_pct by size .. GENERATED FROM PYTHON SOURCE LINES 54-62 .. code-block:: Python mass_check: pd.DataFrame = df_sink_float[['size_passing', 'size_retained', 'mass_pct']].groupby( ['size_passing', 'size_retained']).sum() # check that all are 100 assert np.all(mass_check['mass_pct'] == 100) mass_check .. raw:: html

		mass_pct
size_passing	size_retained
0.063	0.040	100.0
0.100	0.063	100.0
1.000	0.100	100.0

.. GENERATED FROM PYTHON SOURCE LINES 63-65 This indicates that the mass_pct column is actually a density_mass_pct column. We'll rename that but also need to get the size_mass_pct values for those sizes from the size dataset .. GENERATED FROM PYTHON SOURCE LINES 65-84 .. code-block:: Python df_sink_float.rename(columns={'mass_pct': 'density_mass_pct'}, inplace=True) df_size: pd.DataFrame = df_data.loc[np.all(df_data[['density_lo', 'density_hi']].isna(), axis=1), :].copy() df_size.dropna(how='all', axis=1, inplace=True) assert df_size['mass_pct'].sum() == 100 size_pairs = set(list((round(r, 5), round(p, 5)) for r, p in zip(df_sink_float['size_retained'].values, df_sink_float['size_passing'].values))) for r, p in size_pairs: df_sink_float.loc[(df_sink_float['size_retained'] == r) & (df_sink_float['size_passing'] == p), 'size_mass_pct'] = \ df_size.loc[(df_size['size_retained'] == r) & (df_size['size_passing'] == p), 'mass_pct'].values[0] # relocate the size_mass_pct column to the correct position, after size_passing df_sink_float.insert(2, df_sink_float.columns[-1], df_sink_float.pop(df_sink_float.columns[-1])) # add the mass_pct column df_sink_float.insert(loc=6, column='mass_pct', value=df_sink_float['density_mass_pct'] * df_sink_float['size_mass_pct'] / 100) df_sink_float .. raw:: html

	size_retained	size_passing	size_mass_pct	density_lo	density_hi	density_mass_pct	mass_pct	Fe	SiO2	P	TiO2	V
5	0.100	1.000	67.1	1.5	2.7	54.2	36.3682	1.2	1.6	0.180	0.084	0.003
6	0.100	1.000	67.1	2.7	3.3	9.7	6.5087	16.6	42.6	0.980	0.380	0.010
7	0.100	1.000	67.1	3.3	5.0	36.1	24.2231	68.0	78.8	0.033	0.285	0.012
8	0.063	0.100	12.7	1.5	2.7	24.0	3.0480	1.2	79.5	0.015	0.060	0.007
9	0.063	0.100	12.7	2.7	3.3	11.9	1.5113	10.2	54.0	2.320	0.280	0.008
10	0.063	0.100	12.7	3.3	5.0	64.1	8.1407	67.1	1.4	0.174	0.530	0.002
11	0.040	0.063	8.2	1.5	2.7	76.6	6.2812	3.1	71.3	0.850	0.145	0.013
12	0.040	0.063	8.2	2.7	3.3	4.3	0.3526	28.5	24.5	2.780	0.460	0.110
13	0.040	0.063	8.2	3.3	5.0	19.1	1.5662	68.6	0.6	0.069	0.480	0.013

.. GENERATED FROM PYTHON SOURCE LINES 85-87 Create MeanIntervalIndexes -------------------------- .. GENERATED FROM PYTHON SOURCE LINES 87-102 .. code-block:: Python size_intervals = pd.arrays.IntervalArray.from_arrays(df_sink_float['size_retained'], df_sink_float['size_passing'], closed='left') size_index = MeanIntervalIndex(size_intervals) size_index.name = 'size' density_intervals = pd.arrays.IntervalArray.from_arrays(df_sink_float['density_lo'], df_sink_float['density_hi'], closed='left') density_index = MeanIntervalIndex(density_intervals) density_index.name = 'density' df_sink_float.index = pd.MultiIndex.from_arrays([size_index, density_index]) df_sink_float.drop(columns=['size_retained', 'size_passing', 'density_lo', 'density_hi'], inplace=True) df_sink_float .. raw:: html

		size_mass_pct	density_mass_pct	mass_pct	Fe	SiO2	P	TiO2	V
size	density
[0.1, 1.0)	[1.5, 2.7)	67.1	54.2	36.3682	1.2	1.6	0.180	0.084	0.003
	[2.7, 3.3)	67.1	9.7	6.5087	16.6	42.6	0.980	0.380	0.010
	[3.3, 5.0)	67.1	36.1	24.2231	68.0	78.8	0.033	0.285	0.012
[0.063, 0.1)	[1.5, 2.7)	12.7	24.0	3.0480	1.2	79.5	0.015	0.060	0.007
	[2.7, 3.3)	12.7	11.9	1.5113	10.2	54.0	2.320	0.280	0.008
	[3.3, 5.0)	12.7	64.1	8.1407	67.1	1.4	0.174	0.530	0.002
[0.04, 0.063)	[1.5, 2.7)	8.2	76.6	6.2812	3.1	71.3	0.850	0.145	0.013
	[2.7, 3.3)	8.2	4.3	0.3526	28.5	24.5	2.780	0.460	0.110
	[3.3, 5.0)	8.2	19.1	1.5662	68.6	0.6	0.069	0.480	0.013

.. GENERATED FROM PYTHON SOURCE LINES 103-105 Create a 2D IntervalSample -------------------------- .. GENERATED FROM PYTHON SOURCE LINES 105-113 .. code-block:: Python interval_sample = IntervalSample(df_sink_float, name='SINK_FLOAT', moisture_in_scope=False, mass_dry_var='mass_pct') print(interval_sample.is_2d_grid()) print(interval_sample.is_rectilinear_grid) fig = interval_sample.plot_heatmap(components=['mass_pct']) plotly.io.show(fig) .. raw:: html :file: images/sphx_glr_06_resampling_interval_data_2d_001.html .. rst-class:: sphx-glr-script-out .. code-block:: none False False .. GENERATED FROM PYTHON SOURCE LINES 114-117 Upsample -------- We will upsample the data to a new grid .. GENERATED FROM PYTHON SOURCE LINES 117-127 .. code-block:: Python size_grid = sorted([s for s in sizes_all if s >= size_index.left.min() and s <= size_index.right.max()]) density_grid = np.arange(1.5, 5.1, 0.1) new_grids: dict = {'size': size_grid, 'density': density_grid} upsampled: IntervalSample = interval_sample.resample_2d(interval_edges=new_grids, precision=3) pd.testing.assert_frame_equal(interval_sample.aggregate.reset_index(drop=True), upsampled.aggregate.reset_index(drop=True)) fig = upsampled.plot_heatmap(components=['mass_pct']) plotly.io.show(fig) .. raw:: html :file: images/sphx_glr_06_resampling_interval_data_2d_002.html .. rst-class:: sphx-glr-script-out .. code-block:: none (38, 36, 6) .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 0.781 seconds) .. _sphx_glr_download_auto_examples_examples_02_interval_sample_06_resampling_interval_data_2d.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: 06_resampling_interval_data_2d.ipynb <06_resampling_interval_data_2d.ipynb>` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: 06_resampling_interval_data_2d.py <06_resampling_interval_data_2d.py>` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_