Classification Mushroom Data 2020
The Classification Mushroom Data 2020 describes mushroom species by their physical characteristics and classifies each as either poisonous or edible.
Data Set Description
- Primary Data: Describes 173 mushroom species, used for simulating hypothetical mushrooms.
- Secondary Data: Contains 61,069 hypothetical mushrooms for binary classification.
Source code and data are available for download under the CC BY 4.0 license.
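As a rough illustration of how the downloaded secondary data might be read, here is a minimal sketch using only the Python standard library. The semicolon delimiter, the header row, the `class` column with `p`/`e` codes, and the two sample rows are assumptions made for this example and are not stated on this page; check the downloaded file before relying on them. The helper name `read_mushrooms` is hypothetical.

```python
import csv
import io

# Quantitative columns per the variable table; all other columns hold codes.
NUMERIC_COLUMNS = ("cap-diameter", "stem-height", "stem-width")


def read_mushrooms(handle, delimiter=";"):
    """Parse delimited rows into dicts, converting quantitative fields to float.

    Empty cells become None so that missing values stay distinguishable.
    The delimiter default is an assumption, not a documented property.
    """
    rows = []
    for raw in csv.DictReader(handle, delimiter=delimiter):
        row = {}
        for name, value in raw.items():
            if name is None:
                # Surplus cells beyond the header are ignored in this sketch.
                continue
            value = (value or "").strip()
            if value == "":
                row[name] = None
            elif name in NUMERIC_COLUMNS:
                row[name] = float(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Hypothetical two-row sample standing in for the downloaded file.
SAMPLE = """class;cap-diameter;cap-shape;stem-height;stem-width;season
p;15.26;x;16.95;17.09;w
e;6.5;f;;9.2;a
"""

mushrooms = read_mushrooms(io.StringIO(SAMPLE))
print(len(mushrooms), mushrooms[0]["cap-diameter"], mushrooms[1]["stem-height"])
```

To read a real file, pass an open file object instead of the `io.StringIO` sample, for example `open(path, newline="")` with whatever path the download was saved to.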
Data Set Characteristics
Characteristic | Details |
---|---|
Type | Multivariate |
Number of Instances | 173 (Primary), 61,069 (Secondary) |
Number of Variables | 20 |
Attribute Characteristics | Qualitative and Quantitative |
Date Published | 15.10.2020 |
Missing Values | Yes (Primary), No (Secondary) |
Primary Data Set Characteristics
Characteristic | Details |
---|---|
Type | Multivariate |
Number of Instances | 173 |
Number of Variables | 20 |
Attribute Characteristics | Qualitative and Quantitative |
Date Published | 15.10.2020 |
Associated Tasks | Data simulation |
Missing Values | Yes |
Secondary Data Set Characteristics
Characteristic | Details |
---|---|
Type | Multivariate |
Number of Instances | 61,069 |
Number of Variables | 20 |
Attribute Characteristics | Qualitative and Quantitative |
Date Published | 15.10.2020 |
Associated Tasks | Binary classification |
Missing Values | No |
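Because the secondary data mixes qualitative and quantitative variables, most classifiers need the single-letter codes turned into numbers first. The sketch below shows one common option, one-hot encoding, in plain Python. It is not the method used in the publication; the function names `fit_vocabulary` and `encode` and the three sample rows are hypothetical.

```python
def fit_vocabulary(rows, categorical):
    """Collect the sorted set of codes seen in each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in categorical}


def encode(row, numeric, vocabulary):
    """Return the numeric values followed by one 0/1 indicator per known code."""
    vector = [float(row[col]) for col in numeric]
    for col, codes in vocabulary.items():
        vector.extend(1.0 if row[col] == code else 0.0 for code in codes)
    return vector


# Hypothetical rows using a small subset of the 20 variables.
rows = [
    {"cap-diameter": 5.0, "cap-shape": "x", "season": "a"},
    {"cap-diameter": 12.5, "cap-shape": "f", "season": "u"},
    {"cap-diameter": 3.2, "cap-shape": "x", "season": "w"},
]

vocab = fit_vocabulary(rows, ["cap-shape", "season"])
features = [encode(row, ["cap-diameter"], vocab) for row in rows]
print(features[0])
```

Codes that do not appear when the vocabulary is fitted produce all-zero indicators for that column, so the vocabulary should be built from the training portion of the data.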
Comparison with UCI 1987
The Mushroom Data 2020 updates and extends the UCI 1987 mushroom data set. Where the 1987 data set covers a limited number of species described by predefined attributes, the 2020 data set covers 173 species and 61,069 hypothetical instances. The larger and more diverse data support more robust machine learning applications and give a broader testbed for binary classification of mushrooms as poisonous or edible. For more details, see the Results section of the publication.
Publication
Wagner, D., Heider, D., & Hattab, G. (2021). Mushroom data creation, curation, and simulation to support classification tasks. Scientific Reports, 11, 8134. https://doi.org/10.1038/s41598-021-87602-3
Data Variables
Variable | Measurement | Values |
---|---|---|
cap-diameter | Quantitative | Float number in cm |
cap-shape | Qualitative | bell=b, conical=c, convex=x, flat=f, sunken=s, spherical=p, others=o |
cap-surface | Qualitative | fibrous=i, grooves=g, scaly=y, smooth=s, shiny=h, leathery=l, silky=k, sticky=t, wrinkled=w, fleshy=e |
cap-color | Qualitative | brown=n, buff=b, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y, blue=l, orange=o, black=k |
does-bruise-bleed | Qualitative | bruises-or-bleeding=t, no=f |
gill-attachment | Qualitative | adnate=a, adnexed=x, decurrent=d, free=e, sinuate=s, pores=p, unknown=? |
gill-spacing | Qualitative | close=c, distant=d, none=f |
gill-color | Qualitative | see cap-color |
stem-height | Quantitative | Float number in cm |
stem-width | Quantitative | Float number in mm |
stem-root | Qualitative | bulbous=b, swollen=s, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r |
stem-surface | Qualitative | see cap-surface |
stem-color | Qualitative | see cap-color |
veil-type | Qualitative | partial=p, universal=u |
veil-color | Qualitative | see cap-color |
has-ring | Qualitative | ring=t, none=f |
ring-type | Qualitative | cobwebby=c, evanescent=e, flaring=r, grooved=g, large=l, pendant=p, sheathing=s, zone=z, scaly=y, movable=m, none=f, unknown=? |
spore-print-color | Qualitative | see cap-color |
habitat | Qualitative | grasses=g, leaves=l, meadows=m, paths=p, heaths=h, urban=u, waste=w, woods=d |
season | Qualitative | spring=s, summer=u, autumn=a, winter=w |
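The single-letter codes in the table can be mapped back to readable labels with a small lookup. The sketch below covers only a few variables to stay short; the names `CODEBOOK` and `decode` are hypothetical, and the mappings are copied from the table above. Variables listed as "see cap-color" reuse the same mapping.

```python
# Colour codes shared by every variable the table marks "see cap-color".
CAP_COLOR = {
    "n": "brown", "b": "buff", "g": "gray", "r": "green",
    "p": "pink", "u": "purple", "e": "red", "w": "white",
    "y": "yellow", "l": "blue", "o": "orange", "k": "black",
}

# Partial codebook; the remaining variables follow the same pattern.
CODEBOOK = {
    "cap-shape": {
        "b": "bell", "c": "conical", "x": "convex", "f": "flat",
        "s": "sunken", "p": "spherical", "o": "others",
    },
    "cap-color": CAP_COLOR,
    "gill-color": CAP_COLOR,
    "stem-color": CAP_COLOR,
    "veil-color": CAP_COLOR,
    "spore-print-color": CAP_COLOR,
    "season": {"s": "spring", "u": "summer", "a": "autumn", "w": "winter"},
}


def decode(row):
    """Replace known codes with labels; leave other values unchanged."""
    return {
        name: CODEBOOK.get(name, {}).get(value, value)
        for name, value in row.items()
    }


example = {"cap-diameter": 4.2, "cap-shape": "x", "gill-color": "k", "season": "u"}
print(decode(example))
```

Note that the same letter can mean different things in different variables (for example `u` is purple as a colour and summer as a season), so the lookup has to be keyed by variable name as well as by code.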
Licensing
All source code and data are open-source and available for modification under the Creative Commons License CC BY 4.0.