Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 130844 |
| Missing cells | 215065 |
| Missing cells (%) | 18.3% |
| Total size in memory | 9.0 MiB |
| Average record size in memory | 72.0 B |
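The headline figures can be cross-checked with simple arithmetic. The sketch below is illustrative only: the per-column missing counts are the ones this report lists for Gamma, MagSusc, Caliper and Density, and the 8 bytes per cell is an assumption about how the in-memory size was estimated, not something the report states.

```python
# Figures copied from the report; variable names are illustrative.
n_variables = 9
n_observations = 130844
missing_per_column = {
    "Gamma": 27716,
    "MagSusc": 39182,
    "Caliper": 74081,
    "Density": 74086,
}

# Missing cells (%) = missing cells / (variables x observations)
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

# ASSUMPTION: every cell costs 8 bytes (a float64 value or an object pointer).
bytes_per_record = n_variables * 8
total_mib = bytes_per_record * n_observations / 2**20

print(missing_cells, f"{missing_pct:.1f}%", bytes_per_record, f"{total_mib:.1f} MiB")
```

Under these assumptions the results agree with the table: 215065 missing cells, 18.3%, 72 B per record and about 9.0 MiB.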
Variable types
| Text | 2 |
|---|---|
| Numeric | 7 |
Dataset
| Description | A dataset from the WAMEX database. |
|---|---|
| URL | https://www.dmp.wa.gov.au/WAMEX-Minerals-Exploration-1476.aspx |
Alerts
| Gamma has 27716 (21.2%) missing values | Missing |
|---|---|
| MagSusc has 39182 (29.9%) missing values | Missing |
| Caliper has 74081 (56.6%) missing values | Missing |
| Density has 74086 (56.6%) missing values | Missing |
| Gamma is highly skewed (γ1 = -321.1155986) | Skewed |
| MagSusc has 19506 (14.9%) zeros | Zeros |
| Caliper has 19529 (14.9%) zeros | Zeros |
| Density has 19534 (14.9%) zeros | Zeros |
Reproduction
| Analysis started | 2023-07-19 23:04:37.330836 |
|---|---|
| Analysis finished | 2023-07-19 23:04:37.799245 |
| Duration | 0.47 seconds |
| Software version | ydata-profiling v4.3.1 |
| Download configuration | config.json |
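A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path, the report title and the output file name are hypothetical, and it assumes `pandas` and `ydata-profiling` are installed. The imports sit inside the function so that the definition itself loads without those packages.

```python
def build_report(csv_path, output_html="report.html"):
    """Profile a CSV export and write an HTML report (illustrative sketch)."""
    # Imported lazily; both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(
        df,
        title="WAMEX downhole logs",  # hypothetical title
        dataset={
            "description": "A dataset from the WAMEX database.",
            "url": "https://www.dmp.wa.gov.au/WAMEX-Minerals-Exploration-1476.aspx",
        },
    )
    report.to_file(output_html)
    return report
```

A call such as `build_report("wamex_logs.csv")` (hypothetical file name) would then write `report.html` to the working directory.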
HOLEID
Text
| Distinct | 312 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1022.3 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.369218306 |
| Min length | 6 |
Characters and Unicode
| Total characters | 833374 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
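As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the sample value `CB0002` shown in this report. The helper name `category_counts` is hypothetical. The standard library does not expose script or block names, so only categories are shown. The last lines check that the reported mean length equals total characters divided by observations.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Five copies of the first sample row from the report.
sample = ["CB0002"] * 5
counts = category_counts(sample)
# "Lu" is Uppercase Letter, "Nd" is Decimal Number.
print(dict(counts))

# Mean length = total characters / number of observations (report figures).
mean_length = 833374 / 130844
print(round(mean_length, 6))
```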
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | CB0002 |
|---|---|
| 2nd row | CB0002 |
| 3rd row | CB0002 |
| 4th row | CB0002 |
| 5th row | CB0002 |
| Value | Count | Frequency (%) |
| cc0016 | 1670 | 1.3% |
| cc0919 | 1211 | 0.9% |
| cc0026 | 1137 | 0.9% |
| cc0042 | 1076 | 0.8% |
| ccd0026 | 1073 | 0.8% |
| cc0065 | 1011 | 0.8% |
| cc0032 | 1005 | 0.8% |
| cc0048 | 1005 | 0.8% |
| cc0523 | 1003 | 0.8% |
| ccd0001 | 996 | 0.8% |
| Other values (302) | 119657 | 91.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 243487 | 29.2% |
| C | 243428 | 29.2% |
| 1 | 49822 | 6.0% |
| D | 48310 | 5.8% |
| 4 | 36789 | 4.4% |
| 2 | 35712 | 4.3% |
| 3 | 30341 | 3.6% |
| 5 | 28271 | 3.4% |
| 6 | 28229 | 3.4% |
| 7 | 25663 | 3.1% |
| Other values (3) | 63322 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 523376 | 62.8% |
| Uppercase Letter | 309998 | 37.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 243487 | 46.5% |
| 1 | 49822 | 9.5% |
| 4 | 36789 | 7.0% |
| 2 | 35712 | 6.8% |
| 3 | 30341 | 5.8% |
| 5 | 28271 | 5.4% |
| 6 | 28229 | 5.4% |
| 7 | 25663 | 4.9% |
| 8 | 22983 | 4.4% |
| 9 | 22079 | 4.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 243428 | 78.5% |
| D | 48310 | 15.6% |
| B | 18260 | 5.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 523376 | 62.8% |
| Latin | 309998 | 37.2% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 243487 | 46.5% |
| 1 | 49822 | 9.5% |
| 4 | 36789 | 7.0% |
| 2 | 35712 | 6.8% |
| 3 | 30341 | 5.8% |
| 5 | 28271 | 5.4% |
| 6 | 28229 | 5.4% |
| 7 | 25663 | 4.9% |
| 8 | 22983 | 4.4% |
| 9 | 22079 | 4.2% |
Latin
| Value | Count | Frequency (%) |
| C | 243428 | 78.5% |
| D | 48310 | 15.6% |
| B | 18260 | 5.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 833374 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 243487 | 29.2% |
| C | 243428 | 29.2% |
| 1 | 49822 | 6.0% |
| D | 48310 | 5.8% |
| 4 | 36789 | 4.4% |
| 2 | 35712 | 4.3% |
| 3 | 30341 | 3.6% |
| 5 | 28271 | 3.4% |
| 6 | 28229 | 3.4% |
| 7 | 25663 | 3.1% |
| Other values (3) | 63322 | 7.6% |
PROJECTCODE
Text
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1022.3 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 261688 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | CB |
|---|---|
| 2nd row | CB |
| 3rd row | CB |
| 4th row | CB |
| 5th row | CB |
| Value | Count | Frequency (%) |
| cc | 112584 | 86.0% |
| cb | 18260 | 14.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 243428 | 93.0% |
| B | 18260 | 7.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 261688 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 243428 | 93.0% |
| B | 18260 | 7.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 261688 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| C | 243428 | 93.0% |
| B | 18260 | 7.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 261688 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| C | 243428 | 93.0% |
| B | 18260 | 7.0% |
GEOLFROM
Real number (ℝ)
| Distinct | 5671 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.25823202 |
| Minimum | -0.1 |
| Maximum | 86.3 |
| Zeros | 144 |
| Zeros (%) | 0.1% |
| Negative | 1 |
| Negative (%) | < 0.1% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | -0.1 |
|---|---|
| 5-th percentile | 2.4 |
| Q1 | 7.8 |
| median | 14.82 |
| Q3 | 23.8 |
| 95-th percentile | 41.0185 |
| Maximum | 86.3 |
| Range | 86.4 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 12.34705189 |
|---|---|
| Coefficient of variation (CV) | 0.7154297079 |
| Kurtosis | 1.683387354 |
| Mean | 17.25823202 |
| Median Absolute Deviation (MAD) | 7.7 |
| Skewness | 1.170599152 |
| Sum | 2258136.11 |
| Variance | 152.4496904 |
| Monotonicity | Not monotonic |
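Several of the statistics above are derived from others in the same tables. The following sketch recomputes range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative.

```python
# Values copied from the GEOLFROM tables.
minimum, maximum = -0.1, 86.3
q1, q3 = 7.8, 23.8
mean, std = 17.25823202, 12.34705189

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 4))
```

The results match the reported 86.4, 16, 0.7154297079 and 152.4496904 to within rounding.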
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4.7 | 179 | 0.1% |
| 7.8 | 179 | 0.1% |
| 7.6 | 179 | 0.1% |
| 7.5 | 179 | 0.1% |
| 7.4 | 179 | 0.1% |
| 7.3 | 179 | 0.1% |
| 7.2 | 179 | 0.1% |
| 7.1 | 179 | 0.1% |
| 7 | 179 | 0.1% |
| 6.9 | 179 | 0.1% |
| Other values (5661) | 129054 | 98.6% |
| Value | Count | Frequency (%) |
| -0.1 | 1 | < 0.1% |
| 0 | 144 | 0.1% |
| 0.02 | 3 | < 0.1% |
| 0.04 | 1 | < 0.1% |
| 0.06 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 86.3 | 1 | < 0.1% |
| 86.2 | 1 | < 0.1% |
| 86.1 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
| 85.9 | 1 | < 0.1% |
GEOLTO
Real number (ℝ)
| Distinct | 5671 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.25823202 |
| Minimum | -0.1 |
| Maximum | 86.3 |
| Zeros | 144 |
| Zeros (%) | 0.1% |
| Negative | 1 |
| Negative (%) | < 0.1% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | -0.1 |
|---|---|
| 5-th percentile | 2.4 |
| Q1 | 7.8 |
| median | 14.82 |
| Q3 | 23.8 |
| 95-th percentile | 41.0185 |
| Maximum | 86.3 |
| Range | 86.4 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 12.34705189 |
|---|---|
| Coefficient of variation (CV) | 0.7154297079 |
| Kurtosis | 1.683387354 |
| Mean | 17.25823202 |
| Median Absolute Deviation (MAD) | 7.7 |
| Skewness | 1.170599152 |
| Sum | 2258136.11 |
| Variance | 152.4496904 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4.7 | 179 | 0.1% |
| 7.8 | 179 | 0.1% |
| 7.6 | 179 | 0.1% |
| 7.5 | 179 | 0.1% |
| 7.4 | 179 | 0.1% |
| 7.3 | 179 | 0.1% |
| 7.2 | 179 | 0.1% |
| 7.1 | 179 | 0.1% |
| 7 | 179 | 0.1% |
| 6.9 | 179 | 0.1% |
| Other values (5661) | 129054 | 98.6% |
| Value | Count | Frequency (%) |
| -0.1 | 1 | < 0.1% |
| 0 | 144 | 0.1% |
| 0.02 | 3 | < 0.1% |
| 0.04 | 1 | < 0.1% |
| 0.06 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 86.3 | 1 | < 0.1% |
| 86.2 | 1 | < 0.1% |
| 86.1 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
| 85.9 | 1 | < 0.1% |
PRIORITY
Real number (ℝ)
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.05812265 |
| Minimum | 1 |
| Maximum | 30 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 30 |
| 95-th percentile | 30 |
| Maximum | 30 |
| Range | 29 |
| Interquartile range (IQR) | 29 |
Descriptive statistics
| Standard deviation | 13.44013956 |
|---|---|
| Coefficient of variation (CV) | 1.336247333 |
| Kurtosis | -1.344232933 |
| Mean | 10.05812265 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.8098071463 |
| Sum | 1316045 |
| Variance | 180.6373515 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=2)
| Value | Count | Frequency (%) |
| 1 | 89975 | 68.8% |
| 30 | 40869 | 31.2% |
| Value | Count | Frequency (%) |
| 1 | 89975 | 68.8% |
| 30 | 40869 | 31.2% |
| Value | Count | Frequency (%) |
| 30 | 40869 | 31.2% |
| 1 | 89975 | 68.8% |
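Because PRIORITY takes only two values, its summary statistics can be rebuilt from the value counts alone. The sketch below does so, assuming the reported variance is the sample variance (divisor n - 1), which is how the figures appear to line up.

```python
# Value counts copied from the PRIORITY tables.
value_counts = {1: 89975, 30: 40869}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# ASSUMPTION: sample variance with divisor n - 1.
variance = sum(
    count * (value - mean) ** 2 for value, count in value_counts.items()
) / (n - 1)
std = variance ** 0.5

print(n, total, round(mean, 6), round(variance, 4), round(std, 4))
```

The sum (1316045) and mean (about 10.0581) agree with the tables, and the variance and standard deviation agree to within rounding.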
Gamma
Real number (ℝ)
MISSING  SKEWED 
| Distinct | 12342 |
|---|---|
| Distinct (%) | 12.0% |
| Missing | 27716 |
| Missing (%) | 21.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -250.2293953 |
| Minimum | -28074896.05 |
| Maximum | 37062.19 |
| Zeros | 1058 |
| Zeros (%) | 0.8% |
| Negative | 2746 |
| Negative (%) | 2.1% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | -28074896.05 |
|---|---|
| 5-th percentile | 0.38 |
| Q1 | 8.7 |
| median | 21.72 |
| Q3 | 54.90625 |
| 95-th percentile | 136.6 |
| Maximum | 37062.19 |
| Range | 28111958.24 |
| Interquartile range (IQR) | 46.20625 |
Descriptive statistics
| Standard deviation | 87425.70791 |
|---|---|
| Coefficient of variation (CV) | -349.3822451 |
| Kurtosis | 103119.4914 |
| Mean | -250.2293953 |
| Median Absolute Deviation (MAD) | 16.69 |
| Skewness | -321.1155986 |
| Sum | -25805657.08 |
| Variance | 7643254404 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -999.25 | 2745 | 2.1% |
| 0 | 1058 | 0.8% |
| 8.7 | 678 | 0.5% |
| 17.3 | 505 | 0.4% |
| 26 | 432 | 0.3% |
| 34.7 | 329 | 0.3% |
| 13 | 302 | 0.2% |
| 16.5 | 289 | 0.2% |
| 43.3 | 260 | 0.2% |
| 52 | 247 | 0.2% |
| Other values (12332) | 96283 | 73.6% |
| (Missing) | 27716 | 21.2% |
| Value | Count | Frequency (%) |
| -28074896.05 | 1 | < 0.1% |
| -999.25 | 2745 | 2.1% |
| 0 | 1058 | 0.8% |
| 0.01 | 85 | 0.1% |
| 0.02 | 65 | < 0.1% |
| Value | Count | Frequency (%) |
| 37062.19 | 20 | < 0.1% |
| 30173.96 | 1 | < 0.1% |
| 17304.21 | 1 | < 0.1% |
| 13739.54 | 1 | < 0.1% |
| 9983.01 | 1 | < 0.1% |
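The negative mean and extreme skewness of Gamma are driven by a few values rather than by the bulk of the data. The sketch below is a back-of-envelope check using only figures from the tables above. It assumes that -999.25 is a null placeholder (the report does not say so, although that value is the conventional null in LAS well-log files) and that the single minimum of -28074896.05 is an erroneous reading. Both are assumptions made for illustration.

```python
# Figures copied from the Gamma tables.
n_total = 130844
n_missing = 27716
reported_sum = -25805657.08

# ASSUMPTION: -999.25 is a null sentinel, not a measurement.
sentinel, sentinel_count = -999.25, 2745
# ASSUMPTION: the single minimum value is an erroneous reading.
outlier = -28074896.05

n_valid = n_total - n_missing
reported_mean = reported_sum / n_valid

# Remove the sentinel rows and the one outlier from both sum and count.
clean_sum = reported_sum - sentinel * sentinel_count - outlier
clean_n = n_valid - sentinel_count - 1
clean_mean = clean_sum / clean_n

print(round(reported_mean, 4), clean_n, round(clean_mean, 2))
```

Under these assumptions the mean moves from about -250.23 to about 49.93, which is on the same scale as the reported quartiles.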
MagSusc
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 6038 |
|---|---|
| Distinct (%) | 6.6% |
| Missing | 39182 |
| Missing (%) | 29.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -15.74409755 |
| Minimum | -999.25 |
| Maximum | 1327.45 |
| Zeros | 19506 |
| Zeros (%) | 14.9% |
| Negative | 2501 |
| Negative (%) | 1.9% |
| Memory size | 1022.3 KiB |
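The percentages in this table do not all share a denominator. Judging from the arithmetic, Distinct (%) is taken over non-missing rows while Zeros (%) is taken over all rows. This is an inference from the reported numbers, not a statement from the report, and the sketch below shows the check.

```python
# Figures copied from the MagSusc overview table.
n_total = 130844
n_missing = 39182
n_distinct = 6038
n_zeros = 19506

n_present = n_total - n_missing

distinct_pct = 100 * n_distinct / n_present    # matches the reported 6.6%
zeros_pct_all = 100 * n_zeros / n_total        # matches the reported 14.9%
zeros_pct_present = 100 * n_zeros / n_present  # share among recorded values

print(round(distinct_pct, 1), round(zeros_pct_all, 1), round(zeros_pct_present, 1))
```

Measured against recorded values only, zeros make up roughly 21.3% of MagSusc, which is worth keeping in mind when reading the 14.9% figure.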
Quantile statistics
| Minimum | -999.25 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.38 |
| median | 2.8 |
| Q3 | 5.3 |
| 95-th percentile | 38.9 |
| Maximum | 1327.45 |
| Range | 2326.7 |
| Interquartile range (IQR) | 4.92 |
Descriptive statistics
| Standard deviation | 177.0906565 |
|---|---|
| Coefficient of variation (CV) | -11.24806652 |
| Kurtosis | 29.54207752 |
| Mean | -15.74409755 |
| Median Absolute Deviation (MAD) | 2.5 |
| Skewness | -3.756498413 |
| Sum | -1443135.47 |
| Variance | 31361.10064 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 19506 | 14.9% |
| -999.25 | 2501 | 1.9% |
| 2.6 | 1154 | 0.9% |
| 2.8 | 1151 | 0.9% |
| 2.4 | 1096 | 0.8% |
| 2.7 | 995 | 0.8% |
| 2.9 | 973 | 0.7% |
| 3 | 936 | 0.7% |
| 3.2 | 935 | 0.7% |
| 3.3 | 894 | 0.7% |
| Other values (6028) | 61521 | 47.0% |
| (Missing) | 39182 | 29.9% |
| Value | Count | Frequency (%) |
| -999.25 | 2501 | 1.9% |
| 0 | 19506 | 14.9% |
| 0.01 | 4 | < 0.1% |
| 0.02 | 10 | < 0.1% |
| 0.03 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 1327.45 | 1 | < 0.1% |
| 1325.79 | 1 | < 0.1% |
| 1323.22 | 1 | < 0.1% |
| 1322.88 | 1 | < 0.1% |
| 1321.9 | 1 | < 0.1% |
Caliper
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 121 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 74081 |
| Missing (%) | 56.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -39.34055635 |
| Minimum | -999.25 |
| Maximum | 20.3 |
| Zeros | 19529 |
| Zeros (%) | 14.9% |
| Negative | 2688 |
| Negative (%) | 2.1% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | -999.25 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 12.1 |
| Q3 | 13.3 |
| 95-th percentile | 14.5 |
| Maximum | 20.3 |
| Range | 1019.55 |
| Interquartile range (IQR) | 13.3 |
Descriptive statistics
| Standard deviation | 214.1085291 |
|---|---|
| Coefficient of variation (CV) | -5.442437753 |
| Kurtosis | 16.13629663 |
| Mean | -39.34055635 |
| Median Absolute Deviation (MAD) | 2.1 |
| Skewness | -4.256424764 |
| Sum | -2233088 |
| Variance | 45842.46224 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 19529 | 14.9% |
| 12.1 | 5502 | 4.2% |
| 12.2 | 3154 | 2.4% |
| 12 | 2951 | 2.3% |
| -999.25 | 2688 | 2.1% |
| 14.3 | 1816 | 1.4% |
| 14.2 | 1567 | 1.2% |
| 11.9 | 1263 | 1.0% |
| 12.3 | 1255 | 1.0% |
| 13.8 | 1062 | 0.8% |
| Other values (111) | 15976 | 12.2% |
| (Missing) | 74081 | 56.6% |
| Value | Count | Frequency (%) |
| -999.25 | 2688 | 2.1% |
| 0 | 19529 | 14.9% |
| 5.1 | 1 | < 0.1% |
| 6.6 | 2 | < 0.1% |
| 6.9 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 20.3 | 101 | 0.1% |
| 20.2 | 1 | < 0.1% |
| 20.1 | 4 | < 0.1% |
| 20 | 6 | < 0.1% |
| 19.9 | 38 | < 0.1% |
Density
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 382 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 74086 |
| Missing (%) | 56.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -44.81363403 |
| Minimum | -999.25 |
| Maximum | 4.3 |
| Zeros | 19534 |
| Zeros (%) | 14.9% |
| Negative | 2632 |
| Negative (%) | 2.0% |
| Memory size | 1022.3 KiB |
Quantile statistics
| Minimum | -999.25 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.92 |
| Q3 | 2.68 |
| 95-th percentile | 3.36 |
| Maximum | 4.3 |
| Range | 1003.55 |
| Interquartile range (IQR) | 2.68 |
Descriptive statistics
| Standard deviation | 210.474142 |
|---|---|
| Coefficient of variation (CV) | -4.696654191 |
| Kurtosis | 16.61335756 |
| Mean | -44.81363403 |
| Median Absolute Deviation (MAD) | 1.2 |
| Skewness | -4.314156091 |
| Sum | -2543532.24 |
| Variance | 44299.36447 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 19534 | 14.9% |
| -999.25 | 2632 | 2.0% |
| 2.44 | 240 | 0.2% |
| 2.4 | 235 | 0.2% |
| 2.48 | 232 | 0.2% |
| 2.37 | 232 | 0.2% |
| 2.57 | 231 | 0.2% |
| 2.68 | 231 | 0.2% |
| 2.67 | 231 | 0.2% |
| 2.35 | 226 | 0.2% |
| Other values (372) | 32734 | 25.0% |
| (Missing) | 74086 | 56.6% |
| Value | Count | Frequency (%) |
| -999.25 | 2632 | 2.0% |
| 0 | 19534 | 14.9% |
| 0.41 | 1 | < 0.1% |
| 0.45 | 1 | < 0.1% |
| 0.46 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4.3 | 1 | < 0.1% |
| 4.29 | 1 | < 0.1% |
| 4.28 | 1 | < 0.1% |
| 4.27 | 1 | < 0.1% |
| 4.25 | 1 | < 0.1% |
| | HOLEID | PROJECTCODE | GEOLFROM | GEOLTO | PRIORITY | Gamma | MagSusc | Caliper | Density |
|---|---|---|---|---|---|---|---|---|---|
| 0 | CB0002 | CB | 0.14 | 0.14 | 30 | 88.780 | 0.0 | 0.0 | 0.0 |
| 1 | CB0002 | CB | 0.24 | 0.24 | 30 | 88.180 | 0.0 | 0.0 | 0.0 |
| 2 | CB0002 | CB | 0.34 | 0.34 | 30 | 89.830 | 0.0 | 0.0 | 0.0 |
| 3 | CB0002 | CB | 0.44 | 0.44 | 30 | 89.735 | 0.0 | 0.0 | 0.0 |
| 4 | CB0002 | CB | 0.54 | 0.54 | 30 | 97.090 | 0.0 | 0.0 | 0.0 |
| 5 | CB0002 | CB | 0.64 | 0.64 | 30 | 92.170 | 0.0 | 0.0 | 0.0 |
| 6 | CB0002 | CB | 0.74 | 0.74 | 30 | 89.965 | 0.0 | 0.0 | 0.0 |
| 7 | CB0002 | CB | 0.84 | 0.84 | 30 | 87.570 | 0.0 | 0.0 | 0.0 |
| 8 | CB0002 | CB | 0.94 | 0.94 | 30 | 89.075 | 0.0 | 0.0 | 0.0 |
| 9 | CB0002 | CB | 1.04 | 1.04 | 30 | 92.485 | 0.0 | 0.0 | 0.0 |