.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/parallel_coordinates.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_parallel_coordinates.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_parallel_coordinates.py:


====================
Parallel Coordinates
====================

Parallel coordinate plots are a useful tool for Exploratory Data Analysis (EDA).
The lines are typically colored by the target variable, since it is usually the
variable of most interest, but this is optional.

Plotly's interactivity is a real asset for this kind of plot.
To highlight records (samples), click and drag the mouse vertically along the axis
of any variable (feature or target). Multiple selections are possible.
Single-clicking a selection removes it.

.. GENERATED FROM PYTHON SOURCE LINES 15-22

.. code-block:: default


    import pandas as pd
    import plotly.io as pio
    from sklearn.datasets import load_diabetes, load_wine

    from elphick.sklearn_viz.features import plot_parallel_coordinates

.. GENERATED FROM PYTHON SOURCE LINES 23-25

Load Classification Data
------------------------

.. GENERATED FROM PYTHON SOURCE LINES 25-31

.. code-block:: default


    wine = load_wine(as_frame=True)
    X, y = wine.data, wine.target.rename('target')
    df = pd.concat([X, y], axis=1)
    df

.. csv-table::
    :header: "", alcohol, malic_acid, ash, alcalinity_of_ash, magnesium, total_phenols, flavanoids, nonflavanoid_phenols, proanthocyanins, color_intensity, hue, od280/od315_of_diluted_wines, proline, target

    0, 14.23, 1.71, 2.43, 15.6, 127.0, 2.80, 3.06, 0.28, 2.29, 5.64, 1.04, 3.92, 1065.0, 0
    1, 13.20, 1.78, 2.14, 11.2, 100.0, 2.65, 2.76, 0.26, 1.28, 4.38, 1.05, 3.40, 1050.0, 0
    2, 13.16, 2.36, 2.67, 18.6, 101.0, 2.80, 3.24, 0.30, 2.81, 5.68, 1.03, 3.17, 1185.0, 0
    3, 14.37, 1.95, 2.50, 16.8, 113.0, 3.85, 3.49, 0.24, 2.18, 7.80, 0.86, 3.45, 1480.0, 0
    4, 13.24, 2.59, 2.87, 21.0, 118.0, 2.80, 2.69, 0.39, 1.82, 4.32, 1.04, 2.93, 735.0, 0
    ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...
    173, 13.71, 5.65, 2.45, 20.5, 95.0, 1.68, 0.61, 0.52, 1.06, 7.70, 0.64, 1.74, 740.0, 2
    174, 13.40, 3.91, 2.48, 23.0, 102.0, 1.80, 0.75, 0.43, 1.41, 7.30, 0.70, 1.56, 750.0, 2
    175, 13.27, 4.28, 2.26, 20.0, 120.0, 1.59, 0.69, 0.43, 1.35, 10.20, 0.59, 1.56, 835.0, 2
    176, 13.17, 2.59, 2.37, 20.0, 120.0, 1.65, 0.68, 0.53, 1.46, 9.30, 0.60, 1.62, 840.0, 2
    177, 14.13, 4.10, 2.74, 24.5, 96.0, 2.05, 0.76, 0.56, 1.35, 9.20, 0.61, 1.60, 560.0, 2

178 rows × 14 columns



.. GENERATED FROM PYTHON SOURCE LINES 32-34

Plot Classification Data
------------------------

.. GENERATED FROM PYTHON SOURCE LINES 34-39

.. code-block:: default


    fig = plot_parallel_coordinates(df, color=y.name)
    # noinspection PyTypeChecker
    pio.show(fig)

.. raw:: html
    :file: images/sphx_glr_parallel_coordinates_001.html

.. GENERATED FROM PYTHON SOURCE LINES 40-41

The target is optional. If the plot is too dense, consider plotting a random
sample of the records, as demonstrated here.

.. GENERATED FROM PYTHON SOURCE LINES 41-45

.. code-block:: default


    fig = plot_parallel_coordinates(df.sample(frac=0.5))
    fig

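
When sampling to thin out a dense plot, two refinements are sometimes worth considering:
fixing the random seed so the figure is repeatable, and sampling within each class so the
class proportions are preserved. The following is a minimal sketch using plain pandas.
The ``df_demo`` frame and its column names are hypothetical stand-ins, not the wine data,
and nothing here is specific to ``plot_parallel_coordinates``.

.. code-block:: python

    import pandas as pd

    # Hypothetical stand-in for a classification frame: one feature, one class target.
    df_demo = pd.DataFrame({
        "feature": range(12),
        "target": [0] * 6 + [1] * 4 + [2] * 2,
    })

    # Fixing random_state makes the sample, and therefore the plot, repeatable.
    sampled = df_demo.sample(frac=0.5, random_state=42)

    # Sampling within each class keeps the class proportions of the full frame.
    stratified = df_demo.groupby("target").sample(frac=0.5, random_state=42)

    print(len(sampled))
    print(stratified["target"].value_counts().sort_index().to_dict())

Either ``sampled`` or ``stratified`` could then be passed to the plotting function in place
of ``df.sample(frac=0.5)``.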


.. GENERATED FROM PYTHON SOURCE LINES 46-48

Load Regression Data
--------------------

.. GENERATED FROM PYTHON SOURCE LINES 48-54

.. code-block:: default


    diabetes = load_diabetes(as_frame=True, scaled=False)
    X, y = diabetes.data, diabetes.target.rename('target')
    df = pd.concat([X, y], axis=1)
    df

.. csv-table::
    :header: "", age, sex, bmi, bp, s1, s2, s3, s4, s5, s6, target

    0, 59.0, 2.0, 32.1, 101.00, 157.0, 93.2, 38.0, 4.00, 4.8598, 87.0, 151.0
    1, 48.0, 1.0, 21.6, 87.00, 183.0, 103.2, 70.0, 3.00, 3.8918, 69.0, 75.0
    2, 72.0, 2.0, 30.5, 93.00, 156.0, 93.6, 41.0, 4.00, 4.6728, 85.0, 141.0
    3, 24.0, 1.0, 25.3, 84.00, 198.0, 131.4, 40.0, 5.00, 4.8903, 89.0, 206.0
    4, 50.0, 1.0, 23.0, 101.00, 192.0, 125.4, 52.0, 4.00, 4.2905, 80.0, 135.0
    ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...
    437, 60.0, 2.0, 28.2, 112.00, 185.0, 113.8, 42.0, 4.00, 4.9836, 93.0, 178.0
    438, 47.0, 2.0, 24.9, 75.00, 225.0, 166.0, 42.0, 5.00, 4.4427, 102.0, 104.0
    439, 60.0, 2.0, 24.9, 99.67, 162.0, 106.6, 43.0, 3.77, 4.1271, 95.0, 132.0
    440, 36.0, 1.0, 30.0, 95.00, 201.0, 125.2, 42.0, 4.79, 5.1299, 85.0, 220.0
    441, 36.0, 1.0, 19.6, 71.00, 250.0, 133.2, 97.0, 3.00, 4.5951, 92.0, 57.0

442 rows × 11 columns



.. GENERATED FROM PYTHON SOURCE LINES 55-57

Plot Regression Data
--------------------

.. GENERATED FROM PYTHON SOURCE LINES 57-61

.. code-block:: default


    fig = plot_parallel_coordinates(df, color=y.name)
    fig

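
The click-and-drag selections available on these plots can be reasoned about as a simple
filter. The sketch below assumes the usual plotly behaviour, where several ranges on the
same axis are alternatives and every brushed axis must be satisfied. It is an illustration
in plain Python, not plotly's implementation, and ``is_highlighted`` is a hypothetical
helper name. The values are taken from three rows of the wine data.

.. code-block:: python

    from typing import Mapping, Sequence, Tuple

    Range = Tuple[float, float]


    def is_highlighted(record: Mapping[str, float],
                       selections: Mapping[str, Sequence[Range]]) -> bool:
        """Return True if the record passes the selection on every brushed axis."""
        for axis, ranges in selections.items():
            if not ranges:
                continue  # an axis without a selection does not filter anything
            value = record[axis]
            # Ranges on one axis are alternatives: matching any one of them is enough.
            if not any(low <= value <= high for low, high in ranges):
                return False
        return True


    records = [
        {"alcohol": 14.23, "hue": 1.04},
        {"alcohol": 13.20, "hue": 1.05},
        {"alcohol": 13.71, "hue": 0.64},
    ]

    # Two selections on the alcohol axis and one on the hue axis.
    selections = {"alcohol": [(13.0, 13.5), (14.0, 14.5)], "hue": [(1.0, 1.1)]}

    highlighted = [is_highlighted(r, selections) for r in records]
    print(highlighted)  # [True, True, False]

Removing a selection corresponds to deleting its range from ``selections``; with no ranges
left, every record is highlighted again.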


.. GENERATED FROM PYTHON SOURCE LINES 62-63

Categorical data is also supported, shown here by converting the ``sex`` column to a category.

.. GENERATED FROM PYTHON SOURCE LINES 63-67

.. code-block:: default


    df['sex'] = df['sex'].map({1: 'Male', 2: 'Female'}).astype('category')
    fig = plot_parallel_coordinates(df.sample(frac=0.5), color=y.name)
    fig

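
A parallel coordinates axis in plotly is numeric, so a categorical column has to be given
numeric positions with readable tick labels. One common way to do that is sketched below
with pandas category codes. This is an assumption for illustration only: how
``plot_parallel_coordinates`` encodes categories internally is not shown in this example,
and the ``dimension`` dict is a hypothetical axis definition in the shape plotly uses.

.. code-block:: python

    import pandas as pd

    # Small stand-in for the mapped 'sex' column.
    sex = pd.Series([2, 1, 2, 1]).map({1: "Male", 2: "Female"}).astype("category")

    # Integer codes give each category a position on a numeric axis.
    codes = sex.cat.codes.tolist()
    labels = list(sex.cat.categories)

    # Hypothetical axis definition: tick positions are the codes, tick text the labels.
    dimension = {
        "label": "sex",
        "values": codes,
        "tickvals": list(range(len(labels))),
        "ticktext": labels,
    }
    print(dimension)

Categories are ordered alphabetically by default, so ``Female`` receives code 0 and ``Male``
code 1 in this sketch.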


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 2.912 seconds)


.. _sphx_glr_download_auto_examples_parallel_coordinates.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: parallel_coordinates.py <parallel_coordinates.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: parallel_coordinates.ipynb <parallel_coordinates.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery