Overview

Dataset statistics

Number of variables: 30
Number of observations: 350997
Missing cells: 6741010
Missing cells (%): 64.0%
Total size in memory: 80.3 MiB
Average record size in memory: 240.0 B
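
The derived figures in the dataset statistics can be cross-checked from the raw counts. The following is a minimal sketch, assuming that the missing percentage is taken over all cells (variables × observations) and that every value occupies 8 bytes; the 8-byte figure and all variable names are illustrative assumptions, not something the report states.

```python
# Figures copied from the dataset statistics.
n_variables = 30
n_observations = 350_997
missing_cells = 6_741_010

# Share of missing cells over the whole table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each value occupies 8 bytes (a 64-bit number or object pointer).
BYTES_PER_VALUE = 8
record_bytes = n_variables * BYTES_PER_VALUE
total_mib = record_bytes * n_observations / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size: {record_bytes} B, total: {total_mib:.1f} MiB")
```

Under these assumptions the sketch reproduces the reported 64.0%, 240 B and 80.3 MiB.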

Variable types

Text: 23
Numeric: 6
Unsupported: 1

Dataset

Description: A dataset from the WAMEX database.
URL: https://www.dmp.wa.gov.au/WAMEX-Minerals-Exploration-1476.aspx

Alerts

Constant: PRIORITY has constant value "1"
Missing: Strat_Sum has 305746 (87.1%) missing values
Missing: Strat has 346567 (98.7%) missing values
Missing: Mj1 has 54470 (15.5%) missing values
Missing: Mj2 has 172094 (49.0%) missing values
Missing: Mj3 has 261399 (74.5%) missing values
Missing: Mj4 has 302371 (86.1%) missing values
Missing: Mj5 has 350871 (> 99.9%) missing values
Missing: Mn1 has 223856 (63.8%) missing values
Missing: Mn2 has 298133 (84.9%) missing values
Missing: Mn3 has 325115 (92.6%) missing values
Missing: Mn4 has 332154 (94.6%) missing values
Missing: Mn5 has 350982 (> 99.9%) missing values
Missing: Tr1 has 303543 (86.5%) missing values
Missing: Tr2 has 337810 (96.2%) missing values
Missing: Tr3 has 344588 (98.2%) missing values
Missing: Tr4 has 339159 (96.6%) missing values
Missing: Tr5 has 350836 (> 99.9%) missing values
Missing: Chip_pct has 161178 (45.9%) missing values
Missing: Shape1 has 71713 (20.4%) missing values
Missing: Shape2 has 342475 (97.6%) missing values
Missing: Max_Dia has 62831 (17.9%) missing values
Missing: Hardness has 350997 (100.0%) missing values
Missing: Colour has 102224 (29.1%) missing values
Missing: LithComment has 302145 (86.1%) missing values
Missing: Ore_Texture has 347753 (99.1%) missing values
Unsupported: Hardness is an unsupported type, check if it needs cleaning or further analysis
Zeros: GEOLFROM has 14973 (4.3%) zeros
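
The percentages in the alerts are each count divided by the 350997 observations. The sketch below shows one way the "> 99.9%" and "< 0.1%" labels could arise; the function `format_share` and its rounding rule are assumptions made for illustration, not the profiling library's actual code.

```python
def format_share(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place.

    Assumed convention: a share below 100% that would round to 100.0%
    is shown as "> 99.9%", and a non-zero share that would round to
    0.0% is shown as "< 0.1%".
    """
    pct = 100 * count / total
    rounded = round(pct, 1)
    if count < total and rounded >= 100.0:
        return "> 99.9%"
    if count > 0 and rounded <= 0.0:
        return "< 0.1%"
    return f"{pct:.1f}%"


N_ROWS = 350_997  # number of observations
missing = {"Strat_Sum": 305_746, "Mj1": 54_470, "Mj5": 350_871, "Hardness": 350_997}
for name, count in missing.items():
    print(name, format_share(count, N_ROWS))
```

With this rule, Mj5 (126 non-missing rows) is labelled "> 99.9%", while Hardness, which has no values at all, is labelled "100.0%".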

Reproduction

Analysis started: 2023-07-19 23:01:26.125119
Analysis finished: 2023-07-19 23:01:36.396523
Duration: 10.27 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
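
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-07-19 23:01:26.125119", FMT)
finished = datetime.strptime("2023-07-19 23:01:36.396523", FMT)

# Elapsed wall-clock time of the analysis, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```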

Variables

HOLEID
Text

Distinct: 6403
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 9
Median length: 6
Mean length: 6.011851953
Min length: 6
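
The mean length is consistent with the total character count reported for HOLEID (2110142) divided by the number of non-missing values. A minimal check, assuming that is how the mean is defined:

```python
total_characters = 2_110_142  # "Total characters" reported for HOLEID
n_values = 350_997            # observations; HOLEID has no missing values

# Average number of characters per HOLEID value.
mean_length = total_characters / n_values
print(round(mean_length, 6))
```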

Characters and Unicode

Total characters: 2110142
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
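
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper `category_counts` is a hypothetical name, and the input is just the first sample rows of HOLEID. The standard library exposes categories ("Lu" is Uppercase Letter, "Nd" is Decimal Number) but has no lookup for scripts or blocks, so those are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


sample = ["CC0001", "CC0001", "CC0001"]  # first HOLEID sample rows
counts = category_counts(sample)
print(counts)
```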

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: CC0001
2nd row: CC0001
3rd row: CC0001
4th row: CC0001
5th row: CC0001
Value                 Count    Frequency (%)
cc1164                314      0.1%
cc1595                298      0.1%
cc1596                284      0.1%
cc1165                268      0.1%
cc1166                268      0.1%
cc1459                264      0.1%
cc1443                256      0.1%
cc1442                250      0.1%
cc0001                236      0.1%
cc1441                236      0.1%
Other values (6393)   348323   99.2%
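
A frequency table of this shape (the ten most common values plus an aggregated "Other values" row) can be sketched with `collections.Counter`. The function name `common_values` and the tiny input list are illustrative assumptions; this is not the profiling library's implementation.

```python
from collections import Counter


def common_values(values, k=10):
    """Return the k most frequent values plus an aggregated 'other' row."""
    counts = Counter(values)
    top = counts.most_common(k)
    other_distinct = len(counts) - len(top)
    other_count = sum(counts.values()) - sum(c for _, c in top)
    return top, (other_distinct, other_count)


# Tiny illustrative input, not taken from the dataset.
holes = ["cc0001"] * 3 + ["cc0002"] * 2 + ["cb0001", "wk0001"]
top, other = common_values(holes, k=2)
print(top)
print(other)
```

The same arithmetic matches the table: 6403 distinct values minus the 10 listed leaves 6393, and 350997 rows minus the 2674 listed leaves 348323.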

Most occurring characters

ValueCountFrequency (%)
C 584375
27.7%
1 238862
11.3%
0 197982
 
9.4%
2 191223
 
9.1%
3 158961
 
7.5%
4 114346
 
5.4%
5 104310
 
4.9%
6 103931
 
4.9%
9 101964
 
4.8%
7 96961
 
4.6%
Other values (7) 217227
 
10.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1403988
66.5%
Uppercase Letter 706154
33.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 238862
17.0%
0 197982
14.1%
2 191223
13.6%
3 158961
11.3%
4 114346
8.1%
5 104310
7.4%
6 103931
7.4%
9 101964
7.3%
7 96961
6.9%
8 95448
 
6.8%
Uppercase Letter
ValueCountFrequency (%)
C 584375
82.8%
B 89825
 
12.7%
W 14447
 
2.0%
K 10434
 
1.5%
F 4013
 
0.6%
D 1960
 
0.3%
P 1100
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common 1403988
66.5%
Latin 706154
33.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 238862
17.0%
0 197982
14.1%
2 191223
13.6%
3 158961
11.3%
4 114346
8.1%
5 104310
7.4%
6 103931
7.4%
9 101964
7.3%
7 96961
6.9%
8 95448
 
6.8%
Latin
ValueCountFrequency (%)
C 584375
82.8%
B 89825
 
12.7%
W 14447
 
2.0%
K 10434
 
1.5%
F 4013
 
0.6%
D 1960
 
0.3%
P 1100
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2110142
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 584375
27.7%
1 238862
11.3%
0 197982
 
9.4%
2 191223
 
9.1%
3 158961
 
7.5%
4 114346
 
5.4%
5 104310
 
4.9%
6 103931
 
4.9%
9 101964
 
4.8%
7 96961
 
4.6%
Other values (7) 217227
 
10.3%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters701994
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCC
2nd rowCC
3rd rowCC
4th rowCC
5th rowCC
ValueCountFrequency (%)
cc 246725
70.3%
cb 89825
 
25.6%
wk 10434
 
3.0%
wf 4013
 
1.1%

Most occurring characters

ValueCountFrequency (%)
C 583275
83.1%
B 89825
 
12.8%
W 14447
 
2.1%
K 10434
 
1.5%
F 4013
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 701994
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 583275
83.1%
B 89825
 
12.8%
W 14447
 
2.1%
K 10434
 
1.5%
F 4013
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 701994
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 583275
83.1%
B 89825
 
12.8%
W 14447
 
2.1%
K 10434
 
1.5%
F 4013
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 701994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 583275
83.1%
B 89825
 
12.8%
W 14447
 
2.1%
K 10434
 
1.5%
F 4013
 
0.6%

GEOLFROM
Real number (ℝ)

ZEROS 

Distinct: 614
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.56020222
Minimum: 0
Maximum: 143
Zeros: 14973
Zeros (%): 4.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 9
Median: 20
Q3: 35
95-th percentile: 64
Maximum: 143
Range: 143
Interquartile range (IQR): 26
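
The range and interquartile range follow directly from the quantiles listed for GEOLFROM. A short worked check (the dictionary keys are illustrative names):

```python
# Quantiles reported for GEOLFROM.
quantiles = {"min": 0, "q1": 9, "median": 20, "q3": 35, "max": 143}

value_range = quantiles["max"] - quantiles["min"]  # Range
iqr = quantiles["q3"] - quantiles["q1"]            # Interquartile range

print(value_range, iqr)
```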

Descriptive statistics

Standard deviation: 20.26223983
Coefficient of variation (CV): 0.8250029722
Kurtosis: 1.639174278
Mean: 24.56020222
Median Absolute Deviation (MAD): 12
Skewness: 1.216507032
Sum: 8620557.3
Variance: 410.558363
Monotonicity: Not monotonic
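
Several of the descriptive statistics for GEOLFROM are related by simple identities: mean = sum / n, variance = (standard deviation)², and CV = standard deviation / mean. The sketch below checks these against the reported values; tolerances allow for the rounding in the report.

```python
import math

n = 350_997          # observations (GEOLFROM has no missing values)
total = 8_620_557.3  # Sum
std = 20.26223983    # Standard deviation

mean = total / n
variance = std ** 2
cv = std / mean

print(f"mean={mean:.6f} variance={variance:.4f} cv={cv:.6f}")
assert math.isclose(mean, 24.56020222, abs_tol=1e-6)
assert math.isclose(variance, 410.558363, abs_tol=1e-5)
assert math.isclose(cv, 0.8250029722, abs_tol=1e-6)
```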
[Figure: histogram of GEOLFROM with fixed size bins (bins=50)]
Value                Count    Frequency (%)
0                    14973    4.3%
6                    8581     2.4%
4                    8562     2.4%
12                   8543     2.4%
5                    8506     2.4%
9                    8502     2.4%
10                   8487     2.4%
7                    8465     2.4%
3                    8464     2.4%
8                    8456     2.4%
Other values (604)   259458   73.9%

Value   Count   Frequency (%)
0       14973   4.3%
0.2     2       < 0.1%
0.3     2       < 0.1%
0.4     1       < 0.1%
0.5     14      < 0.1%

Value   Count   Frequency (%)
143     2       < 0.1%
142     2       < 0.1%
141     2       < 0.1%
140     2       < 0.1%
139     4       < 0.1%

GEOLTO
Real number (ℝ)

Distinct: 631
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.30145272
Minimum: 0.2
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0.2
5-th percentile: 3
Q1: 11
Median: 22
Q3: 37
95-th percentile: 66
Maximum: 144
Range: 143.8
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 20.28137304
Coefficient of variation (CV): 0.771112275
Kurtosis: 1.634311732
Mean: 26.30145272
Median Absolute Deviation (MAD): 12
Skewness: 1.214693604
Sum: 9231731
Variance: 411.3340925
Monotonicity: Not monotonic
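
Because neither GEOLFROM nor GEOLTO has missing values, their sums can be combined row for row. The sketch below assumes, from the column names only, that each record describes an interval running from GEOLFROM to GEOLTO; under that assumption the difference of the sums is the total logged length and its mean is the average interval length. The report does not state units, so none are given here.

```python
n = 350_997             # observations
sum_from = 8_620_557.3  # Sum of GEOLFROM
sum_to = 9_231_731.0    # Sum of GEOLTO

# Assumption: each row is one interval [GEOLFROM, GEOLTO].
total_logged = sum_to - sum_from
mean_interval = total_logged / n

print(f"total={total_logged:.1f} mean interval={mean_interval:.4f}")
```

The result agrees with the difference of the two reported means, 26.30145272 − 24.56020222.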
[Figure: histogram of GEOLTO with fixed size bins (bins=50)]
ValueCountFrequency (%)
10 8744
 
2.5%
16 8712
 
2.5%
12 8671
 
2.5%
4 8602
 
2.5%
6 8597
 
2.4%
5 8524
 
2.4%
9 8521
 
2.4%
7 8503
 
2.4%
3 8487
 
2.4%
8 8451
 
2.4%
Other values (621) 265185
75.6%
ValueCountFrequency (%)
0.2 2
 
< 0.1%
0.3 2
 
< 0.1%
0.4 1
 
< 0.1%
0.5 13
< 0.1%
0.6 1
 
< 0.1%
ValueCountFrequency (%)
144 4
< 0.1%
143 2
 
< 0.1%
142 2
 
< 0.1%
141 2
 
< 0.1%
140 6
< 0.1%

PRIORITY
Real number (ℝ)

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1
Minimum: 1
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 1
Range: 0
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0
Coefficient of variation (CV): 0
Kurtosis: 0
Mean: 1
Median Absolute Deviation (MAD): 0
Skewness: 0
Sum: 350997
Variance: 0
Monotonicity: Increasing
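
A constant column being labelled "Increasing" is consistent with a non-strict definition of monotonicity, where each value only needs to be greater than or equal to its predecessor. The sketch below illustrates that reading; both helper functions are hypothetical and the exact rule used by the profiling library is an assumption.

```python
def is_monotonic_increasing(values):
    """True if every value is >= its predecessor (non-strict)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_strictly_increasing(values):
    """True if every value is > its predecessor."""
    return all(a < b for a, b in zip(values, values[1:]))


constant = [1] * 5  # stand-in for the PRIORITY column
print(is_monotonic_increasing(constant), is_strictly_increasing(constant))
```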
[Figure: histogram of PRIORITY with fixed size bins (bins=1)]
ValueCountFrequency (%)
1 350997
100.0%
ValueCountFrequency (%)
1 350997
100.0%
ValueCountFrequency (%)
1 350997
100.0%

Strat_Sum
Text

MISSING 

Distinct: 24
Distinct (%): 0.1%
Missing: 305746
Missing (%): 87.1%
Memory size: 2.7 MiB

Length

Max length3
Median length3
Mean length2.740580319
Min length2

Characters and Unicode

Total characters124014
Distinct characters29
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTa
2nd rowTdi
3rd rowTds
4th rowTdm
5th rowMUb
ValueCountFrequency (%)
mum 8268
18.3%
mub 7237
16.0%
hc 5263
11.6%
ta 4720
10.4%
tdi 3116
 
6.9%
mus 2935
 
6.5%
muh 2914
 
6.4%
mut 2504
 
5.5%
tds 2154
 
4.8%
muf 2041
 
4.5%
Other values (14) 4099
9.1%

Most occurring characters

ValueCountFrequency (%)
M 27027
21.8%
U 27027
21.8%
T 11219
9.0%
m 8753
 
7.1%
b 7237
 
5.8%
d 5755
 
4.6%
H 5567
 
4.5%
s 5389
 
4.3%
c 5270
 
4.2%
a 4720
 
3.8%
Other values (19) 16050
12.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 73221
59.0%
Lowercase Letter 50793
41.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 8753
17.2%
b 7237
14.2%
d 5755
11.3%
s 5389
10.6%
c 5270
10.4%
a 4720
9.3%
i 3116
 
6.1%
h 2914
 
5.7%
t 2504
 
4.9%
f 2334
 
4.6%
Other values (5) 2801
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
M 27027
36.9%
U 27027
36.9%
T 11219
15.3%
H 5567
 
7.6%
C 778
 
1.1%
J 543
 
0.7%
D 434
 
0.6%
I 430
 
0.6%
F 93
 
0.1%
L 55
 
0.1%
Other values (4) 48
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 124014
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 27027
21.8%
U 27027
21.8%
T 11219
9.0%
m 8753
 
7.1%
b 7237
 
5.8%
d 5755
 
4.6%
H 5567
 
4.5%
s 5389
 
4.3%
c 5270
 
4.2%
a 4720
 
3.8%
Other values (19) 16050
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 124014
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 27027
21.8%
U 27027
21.8%
T 11219
9.0%
m 8753
 
7.1%
b 7237
 
5.8%
d 5755
 
4.6%
H 5567
 
4.5%
s 5389
 
4.3%
c 5270
 
4.2%
a 4720
 
3.8%
Other values (19) 16050
12.9%

Strat
Text

MISSING 

Distinct: 16
Distinct (%): 0.4%
Missing: 346567
Missing (%): 98.7%
Memory size: 2.7 MiB

Length

Max length3
Median length3
Mean length2.720993228
Min length2

Characters and Unicode

Total characters12054
Distinct characters21
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTa
2nd rowTa
3rd rowTa
4th rowTa
5th rowTa
ValueCountFrequency (%)
mub 836
18.9%
ta 702
15.8%
mum 648
14.6%
tdi 504
11.4%
hc 372
8.4%
muh 268
 
6.0%
tds 240
 
5.4%
mut 216
 
4.9%
muf 192
 
4.3%
tdm 142
 
3.2%
Other values (6) 310
 
7.0%

Most occurring characters

ValueCountFrequency (%)
M 2306
19.1%
U 2306
19.1%
T 1588
13.2%
d 886
 
7.4%
b 836
 
6.9%
m 790
 
6.6%
a 702
 
5.8%
i 504
 
4.2%
H 374
 
3.1%
c 372
 
3.1%
Other values (11) 1390
11.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6736
55.9%
Lowercase Letter 5318
44.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 886
16.7%
b 836
15.7%
m 790
14.9%
a 702
13.2%
i 504
9.5%
c 372
7.0%
s 370
7.0%
h 268
 
5.0%
f 224
 
4.2%
t 216
 
4.1%
Other values (4) 150
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
M 2306
34.2%
U 2306
34.2%
T 1588
23.6%
H 374
 
5.6%
J 108
 
1.6%
C 32
 
0.5%
F 22
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 12054
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 2306
19.1%
U 2306
19.1%
T 1588
13.2%
d 886
 
7.4%
b 836
 
6.9%
m 790
 
6.6%
a 702
 
5.8%
i 504
 
4.2%
H 374
 
3.1%
c 372
 
3.1%
Other values (11) 1390
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12054
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 2306
19.1%
U 2306
19.1%
T 1588
13.2%
d 886
 
7.4%
b 836
 
6.9%
m 790
 
6.6%
a 702
 
5.8%
i 504
 
4.2%
H 374
 
3.1%
c 372
 
3.1%
Other values (11) 1390
11.5%

Mj1
Text

MISSING 

Distinct: 88
Distinct (%): < 0.1%
Missing: 54470
Missing (%): 15.5%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.737501138
Min length2

Characters and Unicode

Total characters811743
Distinct characters26
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st rowGH
2nd rowGH
3rd rowGOM
4th rowHO
5th rowHO
ValueCountFrequency (%)
ch 70733
23.9%
ghm 36371
12.3%
gom 36269
12.2%
klf 25300
 
8.5%
vgh 21633
 
7.3%
goh 10550
 
3.6%
klp 8517
 
2.9%
hsm 8269
 
2.8%
hom 8224
 
2.8%
shm 7850
 
2.6%
Other values (78) 62811
21.2%

Most occurring characters

ValueCountFrequency (%)
H 201610
24.8%
G 137454
16.9%
M 114341
14.1%
C 72925
 
9.0%
O 62691
 
7.7%
F 50314
 
6.2%
K 40407
 
5.0%
L 40224
 
5.0%
S 40124
 
4.9%
V 21633
 
2.7%
Other values (16) 30020
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 804961
99.2%
Lowercase Letter 6782
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 201610
25.0%
G 137454
17.1%
M 114341
14.2%
C 72925
 
9.1%
O 62691
 
7.8%
F 50314
 
6.3%
K 40407
 
5.0%
L 40224
 
5.0%
S 40124
 
5.0%
V 21633
 
2.7%
Other values (13) 23238
 
2.9%
Lowercase Letter
ValueCountFrequency (%)
d 4031
59.4%
p 2747
40.5%
c 4
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 811743
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 201610
24.8%
G 137454
16.9%
M 114341
14.1%
C 72925
 
9.0%
O 62691
 
7.7%
F 50314
 
6.2%
K 40407
 
5.0%
L 40224
 
5.0%
S 40124
 
4.9%
V 21633
 
2.7%
Other values (16) 30020
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 811743
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 201610
24.8%
G 137454
16.9%
M 114341
14.1%
C 72925
 
9.0%
O 62691
 
7.7%
F 50314
 
6.2%
K 40407
 
5.0%
L 40224
 
5.0%
S 40124
 
4.9%
V 21633
 
2.7%
Other values (16) 30020
 
3.7%

Mj2
Text

MISSING 

Distinct: 93
Distinct (%): 0.1%
Missing: 172094
Missing (%): 49.0%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.856922466
Min length2

Characters and Unicode

Total characters511112
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowGO
2nd rowGO
3rd rowHO
4th rowGO
5th rowGH
ValueCountFrequency (%)
gom 22089
12.3%
ghm 22052
12.3%
ch 20268
11.3%
shm 16400
 
9.2%
klf 10474
 
5.9%
vgh 10177
 
5.7%
hsm 8827
 
4.9%
ogf 7288
 
4.1%
goh 6778
 
3.8%
hom 5861
 
3.3%
Other values (83) 48689
27.2%

Most occurring characters

ValueCountFrequency (%)
H 121231
23.7%
G 91153
17.8%
M 88331
17.3%
S 47255
 
9.2%
O 44786
 
8.8%
F 37703
 
7.4%
C 22148
 
4.3%
K 14956
 
2.9%
L 14851
 
2.9%
V 10177
 
2.0%
Other values (15) 18521
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 506839
99.2%
Lowercase Letter 4273
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 121231
23.9%
G 91153
18.0%
M 88331
17.4%
S 47255
 
9.3%
O 44786
 
8.8%
F 37703
 
7.4%
C 22148
 
4.4%
K 14956
 
3.0%
L 14851
 
2.9%
V 10177
 
2.0%
Other values (13) 14248
 
2.8%
Lowercase Letter
ValueCountFrequency (%)
d 2421
56.7%
p 1852
43.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 511112
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 121231
23.7%
G 91153
17.8%
M 88331
17.3%
S 47255
 
9.2%
O 44786
 
8.8%
F 37703
 
7.4%
C 22148
 
4.3%
K 14956
 
2.9%
L 14851
 
2.9%
V 10177
 
2.0%
Other values (15) 18521
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 511112
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 121231
23.7%
G 91153
17.8%
M 88331
17.3%
S 47255
 
9.2%
O 44786
 
8.8%
F 37703
 
7.4%
C 22148
 
4.3%
K 14956
 
2.9%
L 14851
 
2.9%
V 10177
 
2.0%
Other values (15) 18521
 
3.6%

Mj3
Text

MISSING 

Distinct: 81
Distinct (%): 0.1%
Missing: 261399
Missing (%): 74.5%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.84962834
Min length2

Characters and Unicode

Total characters255321
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)< 0.1%

Sample

1st rowHO
2nd rowGH
3rd rowGO
4th rowGH
5th rowGO
ValueCountFrequency (%)
ghm 10636
11.9%
gom 9475
 
10.6%
klf 8177
 
9.1%
ch 7807
 
8.7%
shm 7383
 
8.2%
hsm 4924
 
5.5%
vgh 4622
 
5.2%
ogf 4147
 
4.6%
hsf 3350
 
3.7%
goh 3037
 
3.4%
Other values (71) 26040
29.1%

Most occurring characters

ValueCountFrequency (%)
H 54241
21.2%
G 42328
16.6%
M 42163
16.5%
S 24021
9.4%
F 22568
8.8%
O 20207
 
7.9%
K 12130
 
4.8%
L 11997
 
4.7%
C 8598
 
3.4%
V 4623
 
1.8%
Other values (15) 12445
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 253628
99.3%
Lowercase Letter 1693
 
0.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 54241
21.4%
G 42328
16.7%
M 42163
16.6%
S 24021
9.5%
F 22568
8.9%
O 20207
 
8.0%
K 12130
 
4.8%
L 11997
 
4.7%
C 8598
 
3.4%
V 4623
 
1.8%
Other values (13) 10752
 
4.2%
Lowercase Letter
ValueCountFrequency (%)
d 1020
60.2%
p 673
39.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 255321
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 54241
21.2%
G 42328
16.6%
M 42163
16.5%
S 24021
9.4%
F 22568
8.8%
O 20207
 
7.9%
K 12130
 
4.8%
L 11997
 
4.7%
C 8598
 
3.4%
V 4623
 
1.8%
Other values (15) 12445
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 255321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 54241
21.2%
G 42328
16.6%
M 42163
16.5%
S 24021
9.4%
F 22568
8.8%
O 20207
 
7.9%
K 12130
 
4.8%
L 11997
 
4.7%
C 8598
 
3.4%
V 4623
 
1.8%
Other values (15) 12445
 
4.9%

Mj4
Text

MISSING 

Distinct: 74
Distinct (%): 0.2%
Missing: 302371
Missing (%): 86.1%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.849751162
Min length2

Characters and Unicode

Total characters138572
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowGSM
2nd rowHSM
3rd rowGOM
4th rowGSF
5th rowCH
ValueCountFrequency (%)
ghm 6031
12.4%
gom 5337
 
11.0%
klp 3791
 
7.8%
ch 3601
 
7.4%
shm 3414
 
7.0%
vgh 2817
 
5.8%
klf 2772
 
5.7%
hsm 2614
 
5.4%
pi 1846
 
3.8%
goh 1808
 
3.7%
Other values (64) 14595
30.0%

Most occurring characters

ValueCountFrequency (%)
H 27896
20.1%
G 23497
17.0%
M 23247
16.8%
S 11972
8.6%
O 10645
 
7.7%
F 9135
 
6.6%
K 7411
 
5.3%
L 7316
 
5.3%
P 5874
 
4.2%
C 3990
 
2.9%
Other values (15) 7589
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 137753
99.4%
Lowercase Letter 819
 
0.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 27896
20.3%
G 23497
17.1%
M 23247
16.9%
S 11972
8.7%
O 10645
 
7.7%
F 9135
 
6.6%
K 7411
 
5.4%
L 7316
 
5.3%
P 5874
 
4.3%
C 3990
 
2.9%
Other values (13) 6770
 
4.9%
Lowercase Letter
ValueCountFrequency (%)
d 602
73.5%
p 217
 
26.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 138572
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 27896
20.1%
G 23497
17.0%
M 23247
16.8%
S 11972
8.6%
O 10645
 
7.7%
F 9135
 
6.6%
K 7411
 
5.3%
L 7316
 
5.3%
P 5874
 
4.2%
C 3990
 
2.9%
Other values (15) 7589
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 138572
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 27896
20.1%
G 23497
17.0%
M 23247
16.8%
S 11972
8.6%
O 10645
 
7.7%
F 9135
 
6.6%
K 7411
 
5.3%
L 7316
 
5.3%
P 5874
 
4.2%
C 3990
 
2.9%
Other values (15) 7589
 
5.5%

Mj5
Text

MISSING 

Distinct: 25
Distinct (%): 19.8%
Missing: 350871
Missing (%): > 99.9%
Memory size: 2.7 MiB
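
For a sparse column such as Mj5, the distinct and unique percentages appear to be taken over the non-missing rows rather than over all observations. The check below uses the counts reported for Mj5 in this section (25 distinct values, 8 unique values, 352 total characters); the choice of denominator is an inference from the numbers, not something the report states.

```python
n_rows = 350_997
n_missing = 350_871
n_present = n_rows - n_missing  # rows where Mj5 has a value

distinct_pct = 100 * 25 / n_present  # Distinct = 25
unique_pct = 100 * 8 / n_present     # Unique = 8
mean_length = 352 / n_present        # Total characters = 352

print(n_present, f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%", round(mean_length, 6))
```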

Length

Max length3
Median length3
Mean length2.793650794
Min length2

Characters and Unicode

Total characters352
Distinct characters16
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)6.3%

Sample

1st rowGSF
2nd rowPI
3rd rowHSF
4th rowHSF
5th rowGHM
ValueCountFrequency (%)
ghm 25
19.8%
gom 17
13.5%
vgh 15
11.9%
shm 12
9.5%
klf 9
 
7.1%
ch 7
 
5.6%
hsf 5
 
4.0%
vg 4
 
3.2%
gsf 3
 
2.4%
rs 3
 
2.4%
Other values (15) 26
20.6%

Most occurring characters

ValueCountFrequency (%)
G 71
20.2%
H 69
19.6%
M 65
18.5%
S 34
9.7%
O 24
 
6.8%
F 22
 
6.2%
V 19
 
5.4%
K 12
 
3.4%
L 12
 
3.4%
C 7
 
2.0%
Other values (6) 17
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 352
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 71
20.2%
H 69
19.6%
M 65
18.5%
S 34
9.7%
O 24
 
6.8%
F 22
 
6.2%
V 19
 
5.4%
K 12
 
3.4%
L 12
 
3.4%
C 7
 
2.0%
Other values (6) 17
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 352
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 71
20.2%
H 69
19.6%
M 65
18.5%
S 34
9.7%
O 24
 
6.8%
F 22
 
6.2%
V 19
 
5.4%
K 12
 
3.4%
L 12
 
3.4%
C 7
 
2.0%
Other values (6) 17
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 71
20.2%
H 69
19.6%
M 65
18.5%
S 34
9.7%
O 24
 
6.8%
F 22
 
6.2%
V 19
 
5.4%
K 12
 
3.4%
L 12
 
3.4%
C 7
 
2.0%
Other values (6) 17
 
4.8%

Mn1
Text

MISSING 

Distinct: 91
Distinct (%): 0.1%
Missing: 223856
Missing (%): 63.8%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.82475362
Min length2

Characters and Unicode

Total characters359142
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowMI
2nd rowMI
3rd rowMI
4th rowMI
5th rowMI
ValueCountFrequency (%)
ghm 13482
 
10.6%
gom 13066
 
10.3%
ch 11988
 
9.4%
vgh 8478
 
6.7%
ogf 7901
 
6.2%
hsm 7711
 
6.1%
klf 6038
 
4.7%
hom 5734
 
4.5%
shm 5576
 
4.4%
gsm 4530
 
3.6%
Other values (81) 42637
33.5%

Most occurring characters

ValueCountFrequency (%)
H 77352
21.5%
G 63125
17.6%
M 60334
16.8%
S 33429
9.3%
O 33218
9.2%
F 31648
8.8%
C 14521
 
4.0%
K 10018
 
2.8%
L 9304
 
2.6%
V 8484
 
2.4%
Other values (15) 17709
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 356731
99.3%
Lowercase Letter 2411
 
0.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 77352
21.7%
G 63125
17.7%
M 60334
16.9%
S 33429
9.4%
O 33218
9.3%
F 31648
8.9%
C 14521
 
4.1%
K 10018
 
2.8%
L 9304
 
2.6%
V 8484
 
2.4%
Other values (13) 15298
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
d 1483
61.5%
p 928
38.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 359142
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 77352
21.5%
G 63125
17.6%
M 60334
16.8%
S 33429
9.3%
O 33218
9.2%
F 31648
8.8%
C 14521
 
4.0%
K 10018
 
2.8%
L 9304
 
2.6%
V 8484
 
2.4%
Other values (15) 17709
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 359142
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 77352
21.5%
G 63125
17.6%
M 60334
16.8%
S 33429
9.3%
O 33218
9.2%
F 31648
8.8%
C 14521
 
4.0%
K 10018
 
2.8%
L 9304
 
2.6%
V 8484
 
2.4%
Other values (15) 17709
 
4.9%

Mn2
Text

MISSING 

Distinct: 84
Distinct (%): 0.2%
Missing: 298133
Missing (%): 84.9%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.809000454
Min length2

Characters and Unicode

Total characters148495
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)< 0.1%

Sample

1st rowGH
2nd rowHO
3rd rowOG
4th rowOG
5th rowHO
ValueCountFrequency (%)
gom 5480
 
10.4%
ghm 4933
 
9.3%
ch 3924
 
7.4%
hsm 3674
 
6.9%
ogf 3374
 
6.4%
vgh 3180
 
6.0%
shm 3129
 
5.9%
hom 2201
 
4.2%
gsm 2125
 
4.0%
mi 2057
 
3.9%
Other values (74) 18787
35.5%

Most occurring characters

ValueCountFrequency (%)
H 29691
20.0%
M 27068
18.2%
G 25185
17.0%
S 15553
10.5%
O 13333
9.0%
F 12645
8.5%
C 4476
 
3.0%
K 3880
 
2.6%
I 3654
 
2.5%
L 3506
 
2.4%
Other values (15) 9504
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 147783
99.5%
Lowercase Letter 712
 
0.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 29691
20.1%
M 27068
18.3%
G 25185
17.0%
S 15553
10.5%
O 13333
9.0%
F 12645
8.6%
C 4476
 
3.0%
K 3880
 
2.6%
I 3654
 
2.5%
L 3506
 
2.4%
Other values (13) 8792
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
d 422
59.3%
p 290
40.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 148495
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 29691
20.0%
M 27068
18.2%
G 25185
17.0%
S 15553
10.5%
O 13333
9.0%
F 12645
8.5%
C 4476
 
3.0%
K 3880
 
2.6%
I 3654
 
2.5%
L 3506
 
2.4%
Other values (15) 9504
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 148495
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 29691
20.0%
M 27068
18.2%
G 25185
17.0%
S 15553
10.5%
O 13333
9.0%
F 12645
8.5%
C 4476
 
3.0%
K 3880
 
2.6%
I 3654
 
2.5%
L 3506
 
2.4%
Other values (15) 9504
 
6.4%

Mn3
Text

MISSING 

Distinct: 72
Distinct (%): 0.3%
Missing: 325115
Missing (%): 92.6%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.742485125
Min length2

Characters and Unicode

Total characters70981
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st rowOG
2nd rowWC
3rd rowKS
4th rowGOM
5th rowGSF
ValueCountFrequency (%)
gom 2394
 
9.2%
ghm 2379
 
9.2%
mi 2207
 
8.5%
ch 1878
 
7.3%
hsm 1632
 
6.3%
shm 1327
 
5.1%
vgh 1317
 
5.1%
gsf 1205
 
4.7%
gsm 1108
 
4.3%
hom 1089
 
4.2%
Other values (62) 9346
36.1%

Most occurring characters

ValueCountFrequency (%)
M 13954
19.7%
H 12977
18.3%
G 11303
15.9%
S 7297
10.3%
O 5429
 
7.6%
F 5406
 
7.6%
I 3278
 
4.6%
K 2194
 
3.1%
C 2105
 
3.0%
P 2063
 
2.9%
Other values (15) 4975
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 70747
99.7%
Lowercase Letter 234
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 13954
19.7%
H 12977
18.3%
G 11303
16.0%
S 7297
10.3%
O 5429
 
7.7%
F 5406
 
7.6%
I 3278
 
4.6%
K 2194
 
3.1%
C 2105
 
3.0%
P 2063
 
2.9%
Other values (13) 4741
 
6.7%
Lowercase Letter
ValueCountFrequency (%)
d 121
51.7%
p 113
48.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 70981
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 13954
19.7%
H 12977
18.3%
G 11303
15.9%
S 7297
10.3%
O 5429
 
7.6%
F 5406
 
7.6%
I 3278
 
4.6%
K 2194
 
3.1%
C 2105
 
3.0%
P 2063
 
2.9%
Other values (15) 4975
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70981
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 13954
19.7%
H 12977
18.3%
G 11303
15.9%
S 7297
10.3%
O 5429
 
7.6%
F 5406
 
7.6%
I 3278
 
4.6%
K 2194
 
3.1%
C 2105
 
3.0%
P 2063
 
2.9%
Other values (15) 4975
 
7.0%

Mn4
Text

MISSING 

Distinct: 62
Distinct (%): 0.3%
Missing: 332154
Missing (%): 94.6%
Memory size: 2.7 MiB

Length

Max length4
Median length3
Mean length2.649684233
Min length2

Characters and Unicode

Total characters49928
Distinct characters25
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)< 0.1%

Sample

1st rowHSM
2nd rowCH
3rd rowGOM
4th rowVGH
5th rowCH
ValueCountFrequency (%)
mi 1855
 
9.8%
ch 1576
 
8.4%
gom 1553
 
8.2%
pi 1511
 
8.0%
ghm 1088
 
5.8%
klp 938
 
5.0%
vgh 888
 
4.7%
gsf 861
 
4.6%
hsf 804
 
4.3%
hsm 761
 
4.0%
Other values (52) 7008
37.2%

Most occurring characters

ValueCountFrequency (%)
M 8861
17.7%
H 8107
16.2%
G 6599
13.2%
S 4728
9.5%
F 3961
7.9%
I 3370
 
6.7%
O 3290
 
6.6%
P 2527
 
5.1%
K 2222
 
4.5%
L 1720
 
3.4%
Other values (15) 4543
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 49808
99.8%
Lowercase Letter 120
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 8861
17.8%
H 8107
16.3%
G 6599
13.2%
S 4728
9.5%
F 3961
8.0%
I 3370
 
6.8%
O 3290
 
6.6%
P 2527
 
5.1%
K 2222
 
4.5%
L 1720
 
3.5%
Other values (13) 4423
8.9%
Lowercase Letter
ValueCountFrequency (%)
d 68
56.7%
p 52
43.3%
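Lowercase `d` and `p` make up only 0.2% of the characters in Mn4, so a small number of codes may carry a lowercase letter. A hedged sketch for isolating such values follows. The function name and the example codes are hypothetical illustrations, not values confirmed to occur in this column.

```python
def values_with_lowercase(values):
    """Return the values that contain at least one lowercase character."""
    return [v for v in values if any(ch.islower() for ch in v)]


# Hypothetical codes, for illustration only.
codes = ["HSM", "CH", "GHHp", "GOM", "HSd"]
mixed = values_with_lowercase(codes)
print(mixed)
```

Listing the affected values first makes it easier to decide whether the lowercase suffix is meaningful or should be normalised before counting distinct codes.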

Most occurring scripts

ValueCountFrequency (%)
Latin 49928
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 8861
17.7%
H 8107
16.2%
G 6599
13.2%
S 4728
9.5%
F 3961
7.9%
I 3370
 
6.7%
O 3290
 
6.6%
P 2527
 
5.1%
K 2222
 
4.5%
L 1720
 
3.4%
Other values (15) 4543
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49928
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 8861
17.7%
H 8107
16.2%
G 6599
13.2%
S 4728
9.5%
F 3961
7.9%
I 3370
 
6.7%
O 3290
 
6.6%
P 2527
 
5.1%
K 2222
 
4.5%
L 1720
 
3.4%
Other values (15) 4543
9.1%

Mn5
Text

MISSING 

Distinct5
Distinct (%)33.3%
Missing350982
Missing (%)> 99.9%
Memory size2.7 MiB

Length

Max length3
Median length2
Mean length2.2
Min length2

Characters and Unicode

Total characters33
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)13.3%

Sample

1st rowOGF
2nd rowCH
3rd rowPI
4th rowPI
5th rowCH
ValueCountFrequency (%)
pi 6
40.0%
ch 5
33.3%
ogf 2
 
13.3%
ys 1
 
6.7%
ggm 1
 
6.7%

Most occurring characters

ValueCountFrequency (%)
P 6
18.2%
I 6
18.2%
C 5
15.2%
H 5
15.2%
G 4
12.1%
O 2
 
6.1%
F 2
 
6.1%
Y 1
 
3.0%
S 1
 
3.0%
M 1
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 33
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 6
18.2%
I 6
18.2%
C 5
15.2%
H 5
15.2%
G 4
12.1%
O 2
 
6.1%
F 2
 
6.1%
Y 1
 
3.0%
S 1
 
3.0%
M 1
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 6
18.2%
I 6
18.2%
C 5
15.2%
H 5
15.2%
G 4
12.1%
O 2
 
6.1%
F 2
 
6.1%
Y 1
 
3.0%
S 1
 
3.0%
M 1
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 6
18.2%
I 6
18.2%
C 5
15.2%
H 5
15.2%
G 4
12.1%
O 2
 
6.1%
F 2
 
6.1%
Y 1
 
3.0%
S 1
 
3.0%
M 1
 
3.0%

Tr1
Text

MISSING 

Distinct76
Distinct (%)0.2%
Missing303543
Missing (%)86.5%
Memory size2.7 MiB

Length

Max length4
Median length2
Mean length2.320120538
Min length2

Characters and Unicode

Total characters110099
Distinct characters25
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowHS
2nd rowHS
3rd rowHS
4th rowKS
5th rowMI
ValueCountFrequency (%)
mi 18742
39.5%
mn 3323
 
7.0%
ch 2966
 
6.3%
ka 2546
 
5.4%
ogf 1891
 
4.0%
gom 1468
 
3.1%
pi 1420
 
3.0%
ghm 1264
 
2.7%
vgh 1188
 
2.5%
klp 947
 
2.0%
Other values (66) 11699
24.7%

Most occurring characters

ValueCountFrequency (%)
M 30157
27.4%
I 20545
18.7%
H 10834
 
9.8%
G 8415
 
7.6%
O 6139
 
5.6%
F 5395
 
4.9%
S 5243
 
4.8%
K 4981
 
4.5%
N 3709
 
3.4%
C 3403
 
3.1%
Other values (15) 11278
 
10.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 109867
99.8%
Lowercase Letter 232
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 30157
27.4%
I 20545
18.7%
H 10834
 
9.9%
G 8415
 
7.7%
O 6139
 
5.6%
F 5395
 
4.9%
S 5243
 
4.8%
K 4981
 
4.5%
N 3709
 
3.4%
C 3403
 
3.1%
Other values (13) 11046
 
10.1%
Lowercase Letter
ValueCountFrequency (%)
p 120
51.7%
d 112
48.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 110099
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 30157
27.4%
I 20545
18.7%
H 10834
 
9.8%
G 8415
 
7.6%
O 6139
 
5.6%
F 5395
 
4.9%
S 5243
 
4.8%
K 4981
 
4.5%
N 3709
 
3.4%
C 3403
 
3.1%
Other values (15) 11278
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 110099
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 30157
27.4%
I 20545
18.7%
H 10834
 
9.8%
G 8415
 
7.6%
O 6139
 
5.6%
F 5395
 
4.9%
S 5243
 
4.8%
K 4981
 
4.5%
N 3709
 
3.4%
C 3403
 
3.1%
Other values (15) 11278
 
10.2%

Tr2
Text

MISSING 

Distinct63
Distinct (%)0.5%
Missing337810
Missing (%)96.2%
Memory size2.7 MiB

Length

Max length4
Median length2
Mean length2.390005308
Min length2

Characters and Unicode

Total characters31517
Distinct characters25
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4
Unique (%)< 0.1%

Sample

1st rowKS
2nd rowKLF
3rd rowSHF
4th rowSHF
5th rowOGF
ValueCountFrequency (%)
mi 4357
33.0%
pi 947
 
7.2%
ka 914
 
6.9%
klp 732
 
5.6%
ch 555
 
4.2%
ghm 523
 
4.0%
gom 475
 
3.6%
mn 443
 
3.4%
ogf 420
 
3.2%
mo 351
 
2.7%
Other values (53) 3470
26.3%

Most occurring characters

ValueCountFrequency (%)
M 7585
24.1%
I 5425
17.2%
H 3002
 
9.5%
G 2546
 
8.1%
K 2050
 
6.5%
O 1751
 
5.6%
P 1698
 
5.4%
F 1670
 
5.3%
S 1660
 
5.3%
L 1136
 
3.6%
Other values (15) 2994
 
9.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 31450
99.8%
Lowercase Letter 67
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 7585
24.1%
I 5425
17.2%
H 3002
 
9.5%
G 2546
 
8.1%
K 2050
 
6.5%
O 1751
 
5.6%
P 1698
 
5.4%
F 1670
 
5.3%
S 1660
 
5.3%
L 1136
 
3.6%
Other values (13) 2927
 
9.3%
Lowercase Letter
ValueCountFrequency (%)
p 44
65.7%
d 23
34.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 31517
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 7585
24.1%
I 5425
17.2%
H 3002
 
9.5%
G 2546
 
8.1%
K 2050
 
6.5%
O 1751
 
5.6%
P 1698
 
5.4%
F 1670
 
5.3%
S 1660
 
5.3%
L 1136
 
3.6%
Other values (15) 2994
 
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31517
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 7585
24.1%
I 5425
17.2%
H 3002
 
9.5%
G 2546
 
8.1%
K 2050
 
6.5%
O 1751
 
5.6%
P 1698
 
5.4%
F 1670
 
5.3%
S 1660
 
5.3%
L 1136
 
3.6%
Other values (15) 2994
 
9.5%

Tr3
Text

MISSING 

Distinct45
Distinct (%)0.7%
Missing344588
Missing (%)98.2%
Memory size2.7 MiB

Length

Max length4
Median length2
Mean length2.494304884
Min length2

Characters and Unicode

Total characters15986
Distinct characters24
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)0.1%

Sample

1st rowOGF
2nd rowGOM
3rd rowMI
4th rowMI
5th rowMI
ValueCountFrequency (%)
mi 1971
30.8%
klp 847
13.2%
ghm 549
 
8.6%
ch 369
 
5.8%
gom 303
 
4.7%
ka 296
 
4.6%
ogf 172
 
2.7%
pi 159
 
2.5%
vgh 148
 
2.3%
mn 132
 
2.1%
Other values (35) 1463
22.8%

Most occurring characters

ValueCountFrequency (%)
M 3664
22.9%
I 2213
13.8%
H 1823
11.4%
G 1628
10.2%
K 1188
 
7.4%
P 1022
 
6.4%
L 892
 
5.6%
O 864
 
5.4%
F 667
 
4.2%
S 628
 
3.9%
Other values (14) 1397
 
8.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15952
99.8%
Lowercase Letter 34
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 3664
23.0%
I 2213
13.9%
H 1823
11.4%
G 1628
10.2%
K 1188
 
7.4%
P 1022
 
6.4%
L 892
 
5.6%
O 864
 
5.4%
F 667
 
4.2%
S 628
 
3.9%
Other values (12) 1363
 
8.5%
Lowercase Letter
ValueCountFrequency (%)
p 19
55.9%
d 15
44.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 15986
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 3664
22.9%
I 2213
13.8%
H 1823
11.4%
G 1628
10.2%
K 1188
 
7.4%
P 1022
 
6.4%
L 892
 
5.6%
O 864
 
5.4%
F 667
 
4.2%
S 628
 
3.9%
Other values (14) 1397
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15986
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 3664
22.9%
I 2213
13.8%
H 1823
11.4%
G 1628
10.2%
K 1188
 
7.4%
P 1022
 
6.4%
L 892
 
5.6%
O 864
 
5.4%
F 667
 
4.2%
S 628
 
3.9%
Other values (14) 1397
 
8.7%

Tr4
Text

MISSING 

Distinct40
Distinct (%)0.3%
Missing339159
Missing (%)96.6%
Memory size2.7 MiB

Length

Max length4
Median length2
Mean length2.151207974
Min length2

Characters and Unicode

Total characters25466
Distinct characters21
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowMI
2nd rowMI
3rd rowMI
4th rowMI
5th rowMI
ValueCountFrequency (%)
mi 6796
57.4%
mo 2090
 
17.7%
mn 600
 
5.1%
ghm 345
 
2.9%
mnf 281
 
2.4%
klp 268
 
2.3%
ogf 259
 
2.2%
ch 219
 
1.8%
hof 140
 
1.2%
pi 112
 
0.9%
Other values (30) 728
 
6.1%

Most occurring characters

ValueCountFrequency (%)
M 10447
41.0%
I 6910
27.1%
O 2678
 
10.5%
H 974
 
3.8%
G 925
 
3.6%
N 891
 
3.5%
F 772
 
3.0%
P 415
 
1.6%
K 392
 
1.5%
L 302
 
1.2%
Other values (11) 760
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 25459
> 99.9%
Lowercase Letter 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 10447
41.0%
I 6910
27.1%
O 2678
 
10.5%
H 974
 
3.8%
G 925
 
3.6%
N 891
 
3.5%
F 772
 
3.0%
P 415
 
1.6%
K 392
 
1.5%
L 302
 
1.2%
Other values (9) 753
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
p 6
85.7%
d 1
 
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 25466
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 10447
41.0%
I 6910
27.1%
O 2678
 
10.5%
H 974
 
3.8%
G 925
 
3.6%
N 891
 
3.5%
F 772
 
3.0%
P 415
 
1.6%
K 392
 
1.5%
L 302
 
1.2%
Other values (11) 760
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25466
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 10447
41.0%
I 6910
27.1%
O 2678
 
10.5%
H 974
 
3.8%
G 925
 
3.6%
N 891
 
3.5%
F 772
 
3.0%
P 415
 
1.6%
K 392
 
1.5%
L 302
 
1.2%
Other values (11) 760
 
3.0%

Tr5
Text

MISSING 

Distinct5
Distinct (%)3.1%
Missing350836
Missing (%)> 99.9%
Memory size2.7 MiB

Length

Max length3
Median length2
Mean length2.01863354
Min length2

Characters and Unicode

Total characters325
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)0.6%

Sample

1st rowGHM
2nd rowGHM
3rd rowMI
4th rowMI
5th rowMI
ValueCountFrequency (%)
mi 153
95.0%
pi 3
 
1.9%
ghm 2
 
1.2%
mo 2
 
1.2%
klp 1
 
0.6%

Most occurring characters

ValueCountFrequency (%)
M 157
48.3%
I 156
48.0%
P 4
 
1.2%
G 2
 
0.6%
H 2
 
0.6%
O 2
 
0.6%
K 1
 
0.3%
L 1
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 325
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 157
48.3%
I 156
48.0%
P 4
 
1.2%
G 2
 
0.6%
H 2
 
0.6%
O 2
 
0.6%
K 1
 
0.3%
L 1
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 325
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 157
48.3%
I 156
48.0%
P 4
 
1.2%
G 2
 
0.6%
H 2
 
0.6%
O 2
 
0.6%
K 1
 
0.3%
L 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 157
48.3%
I 156
48.0%
P 4
 
1.2%
G 2
 
0.6%
H 2
 
0.6%
O 2
 
0.6%
K 1
 
0.3%
L 1
 
0.3%

Chip_pct
Real number (ℝ)

MISSING 

Distinct45
Distinct (%)< 0.1%
Missing161178
Missing (%)45.9%
Infinite0
Infinite (%)0.0%
Mean34.27222775
Minimum0
Maximum100
Zeros22
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size2.7 MiB

Quantile statistics

Minimum0
5-th percentile10
Q120
median30
Q340
95-th percentile65
Maximum100
Range100
Interquartile range (IQR)20

Descriptive statistics

Standard deviation16.37143038
Coefficient of variation (CV)0.4776879548
Kurtosis0.5188388138
Mean34.27222775
Median Absolute Deviation (MAD)10
Skewness0.7228901153
Sum6505520
Variance268.0237327
MonotonicityNot monotonic
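The Chip_pct figures above are related by simple arithmetic, which gives a quick consistency check. The sketch below is a hedged illustration using only numbers copied from this report; the variable names are hypothetical. It assumes the coefficient of variation is the standard deviation divided by the mean, and that the mean is the sum divided by the number of non-missing observations.

```python
# Figures copied from this report (dataset overview and Chip_pct section).
n_rows = 350997        # Number of observations
n_missing = 161178     # Missing values in Chip_pct
total = 6505520        # Sum
std = 16.37143038      # Standard deviation
q1, q3 = 20, 40        # Q1 and Q3

n_valid = n_rows - n_missing   # non-missing observations
mean = total / n_valid         # should match the reported mean
cv = std / mean                # should match the reported CV
iqr = q3 - q1                  # should match the reported IQR

print(n_valid, mean, cv, iqr)
```

If any of the derived values disagreed with the reported ones, that would point to a different missing-value count or a different definition of the statistic.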
Histogram with fixed size bins (bins=45)
ValueCountFrequency (%)
30 42579
 
12.1%
40 35170
 
10.0%
20 31516
 
9.0%
50 21133
 
6.0%
10 12707
 
3.6%
25 9669
 
2.8%
60 9648
 
2.7%
15 6362
 
1.8%
35 4687
 
1.3%
70 3669
 
1.0%
Other values (35) 12679
 
3.6%
(Missing) 161178
45.9%
ValueCountFrequency (%)
0 22
< 0.1%
1 5
 
< 0.1%
2 43
< 0.1%
3 39
< 0.1%
4 6
 
< 0.1%
ValueCountFrequency (%)
100 100
 
< 0.1%
95 47
 
< 0.1%
90 911
0.3%
86 1
 
< 0.1%
85 141
 
< 0.1%

Shape1
Text

MISSING 

Distinct6
Distinct (%)< 0.1%
Missing71713
Missing (%)20.4%
Memory size2.7 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters558568
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSR
2nd rowSR
3rd rowSR
4th rowSR
5th rowSR
ValueCountFrequency (%)
aa 120593
43.2%
sa 110536
39.6%
sr 32558
 
11.7%
rr 8529
 
3.1%
va 6293
 
2.3%
wr 775
 
0.3%

Most occurring characters

ValueCountFrequency (%)
A 358015
64.1%
S 143094
 
25.6%
R 50391
 
9.0%
V 6293
 
1.1%
W 775
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 558568
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 358015
64.1%
S 143094
 
25.6%
R 50391
 
9.0%
V 6293
 
1.1%
W 775
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 558568
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 358015
64.1%
S 143094
 
25.6%
R 50391
 
9.0%
V 6293
 
1.1%
W 775
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 558568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 358015
64.1%
S 143094
 
25.6%
R 50391
 
9.0%
V 6293
 
1.1%
W 775
 
0.1%

Shape2
Text

MISSING 

Distinct4
Distinct (%)< 0.1%
Missing342475
Missing (%)97.6%
Memory size2.7 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters17044
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSA
2nd rowSA
3rd rowSR
4th rowSR
5th rowSR
ValueCountFrequency (%)
sr 3003
35.2%
sa 2810
33.0%
rr 1492
17.5%
aa 1217
14.3%

Most occurring characters

ValueCountFrequency (%)
R 5987
35.1%
S 5813
34.1%
A 5244
30.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 17044
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 5987
35.1%
S 5813
34.1%
A 5244
30.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 17044
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 5987
35.1%
S 5813
34.1%
A 5244
30.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17044
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 5987
35.1%
S 5813
34.1%
A 5244
30.8%

Max_Dia
Real number (ℝ)

MISSING 

Distinct56
Distinct (%)< 0.1%
Missing62831
Missing (%)17.9%
Infinite0
Infinite (%)0.0%
Mean14.60768793
Minimum0
Maximum525
Zeros24
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size2.7 MiB

Quantile statistics

Minimum0
5-th percentile5
Q110
median12
Q320
95-th percentile30
Maximum525
Range525
Interquartile range (IQR)10

Descriptive statistics

Standard deviation7.195335875
Coefficient of variation (CV)0.4925718505
Kurtosis209.682909
Mean14.60768793
Median Absolute Deviation (MAD)3
Skewness4.28658828
Sum4209439
Variance51.77285836
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
10 117872
33.6%
20 55764
15.9%
15 54933
15.7%
5 20654
 
5.9%
25 15768
 
4.5%
30 10603
 
3.0%
35 3038
 
0.9%
7 2799
 
0.8%
40 2069
 
0.6%
12 881
 
0.3%
Other values (46) 3785
 
1.1%
(Missing) 62831
17.9%
ValueCountFrequency (%)
0 24
 
< 0.1%
1 51
 
< 0.1%
2 510
0.1%
3 717
0.2%
4 237
 
0.1%
ValueCountFrequency (%)
525 2
< 0.1%
405 1
 
< 0.1%
151 1
 
< 0.1%
145 1
 
< 0.1%
140 3
< 0.1%
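Max_Dia is strongly right-skewed (skewness 4.29, kurtosis 209.7), and its largest values sit far above the 95th percentile of 30. One common screening heuristic, which the report itself does not apply, is Tukey's rule: flag values more than 1.5 × IQR beyond the quartiles. The sketch below applies it to the quartiles and the five largest values reported above. The function name is hypothetical and the 1.5 multiplier is the conventional default, an assumption here.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) Tukey fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for Max_Dia.
q1, q3 = 10, 20
lower, upper = tukey_fences(q1, q3)

# The five largest Max_Dia values listed in this report.
largest = [525, 405, 151, 145, 140]
flagged = [v for v in largest if v > upper]
print(lower, upper, flagged)
```

Whether flagged values are data-entry errors or genuine measurements cannot be determined from the profile alone; the fences only identify candidates for review.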

Hardness
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing350997
Missing (%)100.0%
Memory size2.7 MiB

Colour
Real number (ℝ)

MISSING 

Distinct90
Distinct (%)< 0.1%
Missing102224
Missing (%)29.1%
Infinite0
Infinite (%)0.0%
Mean27.14761248
Minimum1
Maximum96
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size2.7 MiB

Quantile statistics

Minimum1
5-th percentile9
Q121
median25
Q334
95-th percentile50
Maximum96
Range95
Interquartile range (IQR)13

Descriptive statistics

Standard deviation12.72642994
Coefficient of variation (CV)0.4687863415
Kurtosis4.562979416
Mean27.14761248
Median Absolute Deviation (MAD)7
Skewness1.558933637
Sum6753593
Variance161.9620189
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
24 32057
 
9.1%
25 30375
 
8.7%
34 23317
 
6.6%
33 12439
 
3.5%
22 12281
 
3.5%
27 11128
 
3.2%
35 10710
 
3.1%
17 10555
 
3.0%
14 8783
 
2.5%
18 8587
 
2.4%
Other values (80) 88541
25.2%
(Missing) 102224
29.1%
ValueCountFrequency (%)
1 206
 
0.1%
2 779
 
0.2%
3 1209
0.3%
4 1507
0.4%
5 2449
0.7%
ValueCountFrequency (%)
96 26
 
< 0.1%
95 12
 
< 0.1%
94 2
 
< 0.1%
93 359
0.1%
92 143
 
< 0.1%

LithComment
Text

MISSING 

Distinct13963
Distinct (%)28.6%
Missing302145
Missing (%)86.1%
Memory size2.7 MiB

Length

Max length146
Median length103
Mean length14.23139278
Min length1

Characters and Unicode

Total characters695232
Distinct characters87
Distinct categories11
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7455
Unique (%)15.3%

Sample

1st rowInterbedded shales
2nd rowInterbedded shales
3rd rowShales interbedded
4th rowBlack shales with chert
5th rowDuplicate missed out
ValueCountFrequency (%)
7794
 
6.3%
duplicate 5354
 
4.3%
damp 3286
 
2.6%
mn 2957
 
2.4%
eoh 2753
 
2.2%
lab 2659
 
2.1%
clay 2550
 
2.1%
wet 2290
 
1.8%
to 1612
 
1.3%
stained 1515
 
1.2%
Other values (9621) 91424
73.6%

Most occurring characters

ValueCountFrequency (%)
81028
 
11.7%
e 28985
 
4.2%
i 24694
 
3.6%
E 24249
 
3.5%
S 22931
 
3.3%
a 22755
 
3.3%
T 22345
 
3.2%
A 21515
 
3.1%
I 21411
 
3.1%
t 21020
 
3.0%
Other values (77) 404299
58.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 297852
42.8%
Lowercase Letter 238018
34.2%
Space Separator 81028
 
11.7%
Decimal Number 59085
 
8.5%
Other Punctuation 10070
 
1.4%
Dash Punctuation 6356
 
0.9%
Close Punctuation 995
 
0.1%
Open Punctuation 993
 
0.1%
Math Symbol 814
 
0.1%
Connector Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 28985
12.2%
i 24694
10.4%
a 22755
 
9.6%
t 21020
 
8.8%
l 17424
 
7.3%
s 14965
 
6.3%
n 13816
 
5.8%
o 11902
 
5.0%
c 11847
 
5.0%
r 9937
 
4.2%
Other values (16) 60673
25.5%
Uppercase Letter
ValueCountFrequency (%)
E 24249
 
8.1%
S 22931
 
7.7%
T 22345
 
7.5%
A 21515
 
7.2%
I 21411
 
7.2%
D 20455
 
6.9%
L 20231
 
6.8%
M 19681
 
6.6%
O 18421
 
6.2%
N 13326
 
4.5%
Other values (16) 93287
31.3%
Other Punctuation
ValueCountFrequency (%)
. 3176
31.5%
? 1868
18.6%
/ 1736
17.2%
@ 1439
14.3%
% 562
 
5.6%
& 389
 
3.9%
* 255
 
2.5%
: 253
 
2.5%
' 198
 
2.0%
! 102
 
1.0%
Other values (2) 92
 
0.9%
Decimal Number
ValueCountFrequency (%)
3 7490
12.7%
2 7406
12.5%
1 6714
11.4%
0 6383
10.8%
4 6226
10.5%
9 6076
10.3%
6 4975
8.4%
5 4760
8.1%
8 4590
7.8%
7 4465
7.6%
Math Symbol
ValueCountFrequency (%)
= 483
59.3%
< 250
30.7%
> 38
 
4.7%
+ 37
 
4.5%
~ 6
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 993
99.8%
] 2
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 991
99.8%
[ 2
 
0.2%
Space Separator
ValueCountFrequency (%)
81028
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6356
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%
Other Symbol
ValueCountFrequency (%)
� 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 535870
77.1%
Common 159362
 
22.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 28985
 
5.4%
i 24694
 
4.6%
E 24249
 
4.5%
S 22931
 
4.3%
a 22755
 
4.2%
T 22345
 
4.2%
A 21515
 
4.0%
I 21411
 
4.0%
t 21020
 
3.9%
D 20455
 
3.8%
Other values (42) 305510
57.0%
Common
ValueCountFrequency (%)
81028
50.8%
3 7490
 
4.7%
2 7406
 
4.6%
1 6714
 
4.2%
0 6383
 
4.0%
- 6356
 
4.0%
4 6226
 
3.9%
9 6076
 
3.8%
6 4975
 
3.1%
5 4760
 
3.0%
Other values (25) 21948
 
13.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 695230
> 99.9%
Specials 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
81028
 
11.7%
e 28985
 
4.2%
i 24694
 
3.6%
E 24249
 
3.5%
S 22931
 
3.3%
a 22755
 
3.3%
T 22345
 
3.2%
A 21515
 
3.1%
I 21411
 
3.1%
t 21020
 
3.0%
Other values (76) 404297
58.2%
Specials
ValueCountFrequency (%)
� 2
100.0%
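The single character in the Specials block is U+FFFD REPLACEMENT CHARACTER, which typically appears when some bytes could not be decoded in the encoding used to read the data. A minimal sketch for locating affected comments follows; the function name and the example strings are hypothetical, and the second string is invented purely to demonstrate a match.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def find_replacement_chars(values):
    """Return the indices of strings that contain U+FFFD."""
    return [i for i, text in enumerate(values) if REPLACEMENT_CHAR in text]


# Hypothetical comments; only the second contains U+FFFD.
comments = ["Interbedded shales", "Fe 60\ufffd", "Black shales with chert"]
hits = find_replacement_chars(comments)
print(hits)
```

Once the affected rows are known, the original source file can be re-read with a different encoding to see whether the intended characters can be recovered.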

Ore_Texture
Text

MISSING 

Distinct1030
Distinct (%)31.8%
Missing347753
Missing (%)99.1%
Memory size2.7 MiB

Length

Max length84
Median length59
Mean length17.19235512
Min length1

Characters and Unicode

Total characters55772
Distinct characters83
Distinct categories10
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique195
Unique (%)6.0%

Sample

1st row low recovery
2nd row Steel blue hematite and bedding
3rd row clay in P of GHHp
4th row clay in P of GHHp
5th row clay in P of GHHp
ValueCountFrequency (%)
588
 
6.3%
duplicate 274
 
2.9%
clay 227
 
2.4%
mn 165
 
1.8%
of 159
 
1.7%
in 144
 
1.5%
injected 142
 
1.5%
damp 141
 
1.5%
and 115
 
1.2%
vgh 114
 
1.2%
Other values (984) 7283
77.9%

Most occurring characters

ValueCountFrequency (%)
9583
 
17.2%
e 3823
 
6.9%
i 3036
 
5.4%
a 2790
 
5.0%
t 2590
 
4.6%
l 2371
 
4.3%
s 2253
 
4.0%
n 1805
 
3.2%
c 1732
 
3.1%
o 1728
 
3.1%
Other values (73) 24061
43.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32928
59.0%
Uppercase Letter 9647
 
17.3%
Space Separator 9583
 
17.2%
Decimal Number 2095
 
3.8%
Other Punctuation 907
 
1.6%
Dash Punctuation 394
 
0.7%
Close Punctuation 78
 
0.1%
Open Punctuation 74
 
0.1%
Math Symbol 64
 
0.1%
Other Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3823
11.6%
i 3036
 
9.2%
a 2790
 
8.5%
t 2590
 
7.9%
l 2371
 
7.2%
s 2253
 
6.8%
n 1805
 
5.5%
c 1732
 
5.3%
o 1728
 
5.2%
d 1588
 
4.8%
Other values (16) 9212
28.0%
Uppercase Letter
ValueCountFrequency (%)
H 889
 
9.2%
M 787
 
8.2%
O 779
 
8.1%
S 757
 
7.8%
T 737
 
7.6%
I 684
 
7.1%
E 635
 
6.6%
L 520
 
5.4%
G 514
 
5.3%
A 505
 
5.2%
Other values (16) 2840
29.4%
Other Punctuation
ValueCountFrequency (%)
. 460
50.7%
? 152
 
16.8%
/ 122
 
13.5%
% 69
 
7.6%
: 54
 
6.0%
& 21
 
2.3%
* 11
 
1.2%
' 8
 
0.9%
; 3
 
0.3%
# 3
 
0.3%
Other values (2) 4
 
0.4%
Decimal Number
ValueCountFrequency (%)
0 330
15.8%
9 308
14.7%
1 274
13.1%
2 257
12.3%
7 179
8.5%
8 160
7.6%
4 148
7.1%
5 147
7.0%
3 147
7.0%
6 145
6.9%
Math Symbol
ValueCountFrequency (%)
= 42
65.6%
< 11
 
17.2%
+ 9
 
14.1%
> 2
 
3.1%
Space Separator
ValueCountFrequency (%)
9583
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 394
100.0%
Close Punctuation
ValueCountFrequency (%)
) 78
100.0%
Open Punctuation
ValueCountFrequency (%)
( 74
100.0%
Other Symbol
ValueCountFrequency (%)
� 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 42575
76.3%
Common 13197
 
23.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3823
 
9.0%
i 3036
 
7.1%
a 2790
 
6.6%
t 2590
 
6.1%
l 2371
 
5.6%
s 2253
 
5.3%
n 1805
 
4.2%
c 1732
 
4.1%
o 1728
 
4.1%
d 1588
 
3.7%
Other values (42) 18859
44.3%
Common
ValueCountFrequency (%)
9583
72.6%
. 460
 
3.5%
- 394
 
3.0%
0 330
 
2.5%
9 308
 
2.3%
1 274
 
2.1%
2 257
 
1.9%
7 179
 
1.4%
8 160
 
1.2%
? 152
 
1.2%
Other values (21) 1100
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55770
> 99.9%
Specials 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9583
 
17.2%
e 3823
 
6.9%
i 3036
 
5.4%
a 2790
 
5.0%
t 2590
 
4.6%
l 2371
 
4.3%
s 2253
 
4.0%
n 1805
 
3.2%
c 1732
 
3.1%
o 1728
 
3.1%
Other values (72) 24059
43.1%
Specials
ValueCountFrequency (%)
� 2
100.0%

Sample

   HOLEID  PROJECTCODE  GEOLFROM  GEOLTO  PRIORITY  Strat_Sum  Strat  Mj1  Mj2  Mj3  Mj4  Mj5  Mn1  Mn2  Mn3  Mn4  Mn5  Tr1  Tr2  Tr3  Tr4  Tr5  Chip_pct  Shape1  Shape2  Max_Dia  Hardness  Colour  LithComment  Ore_Texture
0  CC0001  CC           0.0       2.0     1         Ta         NaN    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN     NaN     NaN      NaN       NaN     NaN          NaN
1  CC0001  CC           0.0       1.0     1         NaN        NaN    GH   GO   HO   NaN  NaN  MI   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  50.0      SR      NaN     20.0     NaN       NaN     NaN          NaN
2  CC0001  CC           1.0       2.0     1         NaN        NaN    GH   GO   NaN  NaN  NaN  MI   NaN  NaN  NaN  NaN  HS   NaN  NaN  NaN  NaN  70.0      SR      NaN     20.0     NaN       NaN     NaN          NaN
3  CC0001  CC           2.0       3.0     1         NaN        NaN    GOM  HO   GH   NaN  NaN  MI   NaN  NaN  NaN  NaN  HS   NaN  NaN  NaN  NaN  50.0      SR      NaN     15.0     NaN       NaN     NaN          NaN
4  CC0001  CC           2.0       4.0     1         Tdi        NaN    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN     NaN     NaN      NaN       NaN     NaN          NaN
5  CC0001  CC           3.0       4.0     1         NaN        NaN    HO   GO   NaN  NaN  NaN  MI   GH   NaN  NaN  NaN  HS   KS   NaN  NaN  NaN  50.0      SR      NaN     15.0     NaN       NaN     NaN          NaN
6  CC0001  CC           4.0       6.0     1         Tds        NaN    NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN     NaN     NaN      NaN       NaN     NaN          NaN
7  CC0001  CC           4.0       5.0     1         NaN        NaN    HO   GH   GO   NaN  NaN  MI   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  60.0      SR      NaN     20.0     NaN       NaN     NaN          NaN
8  CC0001  CC           5.0       6.0     1         NaN        NaN    GH   GO   NaN  NaN  NaN  MI   HO   OG   NaN  NaN  NaN  NaN  NaN  NaN  NaN  60.0      SR      NaN     20.0     NaN       NaN     NaN          NaN
9  CC0001  CC           6.0       7.0     1         NaN        NaN    HO   GH   NaN  NaN  NaN  MI   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  80.0      SR      NaN     25.0     NaN       NaN     NaN          NaN