
Mushroom Classification

Full ML pipeline — edible vs poisonous — EDA, feature engineering, scaling, 4 classifiers with confusion matrix, ROC & PR curves, and feature importance

Accuracy — Fraction of all predictions that were correct. With balanced classes this is a reliable overall metric. Ranges from 0 to 1.
Overall correct / total
Precision — Of all mushrooms predicted poisonous, what fraction actually were? High precision means fewer false alarms (edible mushrooms flagged as poisonous).
True pos / (True pos + False pos)
Recall — Of all truly poisonous mushrooms, what fraction did we catch? High recall means fewer misses (poisonous mushrooms labeled as edible — the dangerous mistake).
True pos / (True pos + False neg)
F1 Score — Harmonic mean of Precision and Recall. A single balanced measure, useful when you care about both false positives and false negatives.
2 × Precision × Recall / (Precision + Recall)
AUC-ROC — Area under the ROC curve: the probability that the model ranks a random poisonous mushroom above a random edible one. 1.0 = perfect ranking, 0.5 = random.
Confusion Matrix — True Edible (top-left) and True Poisonous (bottom-right) are correct predictions. Off-diagonals show errors — predicting poisonous as edible (a False Negative) is the dangerous case.
Feature Importance — For tree models: mean decrease in impurity across all splits. For Logistic Regression: normalized absolute coefficient. Higher = more influence on the prediction.
ROC Curve — Receiver Operating Characteristic: plots True Positive Rate vs False Positive Rate at every threshold. The dashed line is a random classifier; larger area = better model.
Precision-Recall Curve — Plots Precision vs Recall at every classification threshold. Average Precision (AP) summarizes the curve. More informative than ROC when classes are imbalanced.
Model Accuracy Comparison — Accuracy and F1 score for all four classifiers trained on the same train/test split. The mushroom dataset is nearly linearly separable (odor alone achieves ~98% accuracy), so most models perform near-perfectly.
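The metric definitions above map directly onto scikit-learn's metrics module. A minimal sketch with hypothetical labels (1 = poisonous, 0 = edible) — toy arrays for illustration, not output from the project's models:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical ground truth, hard predictions, and predicted P(poisonous)
y_true  = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred  = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.2, 0.6, 0.9, 0.8, 0.4, 0.7, 0.3]

print("Accuracy :", accuracy_score(y_true, y_pred))    # (TP + TN) / total -> 0.75
print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)   -> 0.75
print("Recall   :", recall_score(y_true, y_pred))      # TP / (TP + FN)   -> 0.75
print("F1       :", f1_score(y_true, y_pred))          # harmonic mean    -> 0.75
print("AUC-ROC  :", roc_auc_score(y_true, y_score))    # ranking quality  -> 0.9375
```

Note that AUC-ROC is computed from the continuous scores, not the hard predictions — it measures ranking quality across all thresholds at once.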
Exploratory Data Analysis

README.md

End-to-end binary classification pipeline on the UCI Mushroom dataset (8,124 samples, 22 categorical features). Predicts whether a mushroom is edible or poisonous. Switch models with the toggle above — each model is trained once and cached for instant replay. EDA plots and the model comparison chart load in parallel.

ML Pipeline

  1. Load data — read mushrooms.csv (8,124 × 23 columns)
  2. Feature engineering — replace '?' (missing stalk-root) with NaN; impute with column mode
  3. Label encoding — LabelEncoder applied to all 22 categorical features + target (e→0, p→1)
  4. Train / test split — 80 / 20 stratified split, random_state=42
  5. Scale X — StandardScaler fit on train, applied to both sets
  6. Fit classifier — one of 4 algorithms; evaluate on held-out test set
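The six steps above can be sketched roughly as follows. A tiny synthetic frame stands in for mushrooms.csv (the real file has 8,124 rows and 23 columns); the column set and the engineered signal are assumptions for illustration only:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 1. Load data — synthetic stand-in for pd.read_csv("mushrooms.csv")
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "class": rng.choice(["e", "p"], n),
    "odor": rng.choice(["a", "n", "f"], n),
    "stalk-root": rng.choice(["b", "c", "?"], n),  # '?' marks missing values
})
df["odor"] = np.where(df["class"] == "p", "f", df["odor"])  # inject a learnable signal

# 2. Feature engineering — '?' -> NaN, then impute with the column mode
df = df.replace("?", np.nan)
df["stalk-root"] = df["stalk-root"].fillna(df["stalk-root"].mode()[0])

# 3. Label encoding — every categorical column, including the target (e -> 0, p -> 1)
for col in df.columns:
    df[col] = LabelEncoder().fit_transform(df[col])

# 4. Train / test split — 80/20, stratified on the class label
X, y = df.drop(columns="class"), df["class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# 5. Scale X — fit the scaler on train only, apply to both sets
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# 6. Fit classifier and evaluate on the held-out test set
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train_s, y_train)
acc = accuracy_score(y_test, clf.predict(X_test_s))
print("test accuracy:", acc)
```

Fitting the scaler on the training split only (step 5) keeps test-set statistics out of the preprocessing, which is what makes the held-out accuracy an honest estimate.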

Models

  • Random Forest — 100 trees, random_state=42
  • Gradient Boosting — 100 estimators, random_state=42
  • Logistic Regression — L2 penalty, max_iter=1000
  • Decision Tree — unpruned, random_state=42
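The four configurations listed above might be instantiated like this — the dict layout and the toy training data are assumptions for the sketch, not the project's exact code:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

models = {
    "Random Forest":       RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting":   GradientBoostingClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(penalty="l2", max_iter=1000),
    "Decision Tree":       DecisionTreeClassifier(random_state=42),  # unpruned: max_depth=None
}

# All four share the fit/score interface, so comparing them is a simple loop
X, y = make_classification(n_samples=200, n_features=8, random_state=42)
for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: train accuracy = {model.score(X, y):.2f}")
```

The uniform estimator API is what lets the page train each model once, cache it, and swap between them with a toggle.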

Charts

  • Confusion Matrix — seaborn heatmap with True/Predicted class labels
  • ROC Curve — FPR vs TPR with AUC annotation and fill
  • PR Curve — Precision vs Recall with Average Precision annotation
  • Feature Importance — top-10 features (tree impurity or |coef| for LR)
  • Model Comparison — grouped bar chart of accuracy & F1 for all 4 models
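The Feature Importance chart's two branches (tree impurity vs. |coef| for Logistic Regression) could look roughly like this — the `importances` helper and the synthetic data are hypothetical, not the project's code:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def importances(model, feature_names, top=10):
    """Top features by impurity importance (trees) or normalized |coef| (linear)."""
    if hasattr(model, "feature_importances_"):    # tree-based models
        scores = model.feature_importances_
    else:                                         # linear models
        scores = np.abs(model.coef_).ravel()
        scores = scores / scores.sum()            # normalize so scores sum to 1
    order = np.argsort(scores)[::-1][:top]
    return [(feature_names[i], float(scores[i])) for i in order]

X, y = make_classification(n_samples=300, n_features=12, random_state=42)
names = [f"f{i}" for i in range(X.shape[1])]

rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
lr = LogisticRegression(max_iter=1000).fit(X, y)
print(importances(rf, names)[:3])  # top-3 by impurity
print(importances(lr, names)[:3])  # top-3 by |coefficient|
```

Normalizing the coefficients puts both model families on the same 0-to-1 scale, so a single chart component can render either.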

EDA

  • Class balance & overview — edible vs poisonous counts, unique values per feature
  • Missing values — stalk-root has 2,480 missing values (30.5%)
  • Feature analysis — stacked distribution of top 9 features by class
  • Bivariate analysis — top 6 features ranked by Cramér's V with class label
  • Correlation heatmap — Cramér's V matrix for all 22 categorical features
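Cramér's V, which drives both the bivariate ranking and the correlation heatmap, can be computed from `chi2_contingency` as sketched below. Yates correction is disabled here so a perfectly associated 2×2 table scores exactly 1; the toy series are illustrative:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Cramér's V between two categorical series: 0 = independent, 1 = perfect."""
    table = pd.crosstab(x, y)
    chi2, _, _, _ = chi2_contingency(table, correction=False)
    n = table.to_numpy().sum()
    r, c = table.shape
    return float(np.sqrt(chi2 / (n * (min(r, c) - 1))))

# Toy example: a feature perfectly aligned with the class label
feature = pd.Series(["a", "a", "b", "b", "a", "b"])
label   = pd.Series(["e", "e", "p", "p", "e", "p"])
print(cramers_v(feature, label))  # perfect association -> 1.0
```

Unlike Pearson correlation, V needs no numeric ordering of the categories, which is why it suits a dataset where all 22 features are nominal.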

Tech Stack

  • scikit-learn — classifiers, preprocessing, metrics
  • scipy — chi2_contingency for Cramér's V
  • seaborn / matplotlib — confusion matrix, ROC, PR curve, EDA plots
  • pandas / numpy — data loading and feature engineering
  • Chart.js — feature importance and model comparison charts
  • Flask — serves the page with /run, /compare, and /plots API endpoints