`vllama train` — AutoML Training¶

Automatically trains and compares multiple ML models on preprocessed data with hyperparameter tuning, and produces a ranked leaderboard.

Syntax¶

vllama train --path <data_folder> --target <column>

Parameters¶

Parameter	Short	Description
`--path`	`-p`	Path to folder containing `train_data.csv` and `test_data.csv` (output from `vllama data`)
`--target`	`-t`	Name of the target column

What It Does¶

Auto-detects the task type (classification or regression) from the target column
Trains every supported model, tuning each one's hyperparameters with RandomizedSearchCV
Evaluates every model on the test set with comprehensive metrics
Ranks models in a leaderboard
Saves all models and generates visualizations
Produces an HTML report with all results
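
The exact rule `vllama train` uses to tell classification from regression is not spelled out on this page. As a rough illustration only, the sketch below shows one common heuristic: non-numeric targets are treated as class labels, and numeric targets count as classes only when they are integer-valued with few distinct values. The function name `infer_task_type` and the `max_classes` threshold are hypothetical and not part of vllama.

```python
from numbers import Number


def infer_task_type(target_values, max_classes=20):
    """Guess 'classification' or 'regression' from a target column.

    Hypothetical heuristic for illustration; not vllama's documented rule.
    """
    values = [v for v in target_values if v is not None]
    if not values:
        raise ValueError("target column is empty")

    # Strings, booleans and other non-numeric labels can only be classes.
    if any(isinstance(v, bool) or not isinstance(v, Number) for v in values):
        return "classification"

    # Numeric targets: a small set of integer-like values looks like class IDs.
    distinct = set(values)
    all_integral = all(float(v).is_integer() for v in distinct)
    if all_integral and len(distinct) <= max_classes:
        return "classification"

    # Anything else (fractional values, many distinct values) is continuous.
    return "regression"


print(infer_task_type(["spam", "ham", "spam"]))    # classification
print(infer_task_type([199.5, 250.0, 310.25]))     # regression
```

A heuristic like this can misread an integer-valued regression target with few distinct values (for example, a count from 0 to 5) as classification, so it is worth checking which task type the report shows.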

Models Trained¶

Classification

Logistic Regression
Random Forest
XGBoost
LightGBM
CatBoost
SVM
KNN
MLP (Neural Net)
Naive Bayes

Regression

Random Forest
XGBoost
LightGBM
CatBoost
SVR
KNN
MLP (Neural Net)
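
To make the tuning step concrete, here is a minimal sketch of what a RandomizedSearchCV run looks like for one of the models above, written directly against scikit-learn. The synthetic dataset, the search space, `n_iter`, and the number of folds are all assumptions chosen for the demo; the ranges and settings vllama actually uses are not documented on this page.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for train_data.csv / test_data.csv (assumption for the demo).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical search space; vllama's real ranges are not documented here.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # sample 5 random combinations out of the 27 possible
    cv=3,                # 3-fold cross-validation on the training split
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# The refit best estimator is then scored on the held-out test split.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Random search samples a fixed number of combinations instead of trying every one, which keeps tuning time bounded when the same procedure is repeated for every model in the list.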

Examples¶

# Standard usage after vllama data
vllama train --path ./output_folder_20240101_120000 --target price

# Short form
vllama train -p ./output_folder_20240101_120000 -t label

Output Structure¶

results/
├── model_summary.csv           ← Leaderboard: all models ranked by metric
├── best_model.pkl              ← Best model, loadable with joblib
├── best_model.txt              ← Best model name and score
├── report.html                 ← Full interactive HTML report ← open this
└── per_model/
    ├── RandomForest/
    │   ├── RandomForest_best_model.pkl
    │   ├── RandomForest_tuning_results.csv
    │   ├── RandomForest_confusion_matrix.png  (classification)
    │   └── RandomForest_roc_curve.png         (binary classification)
    ├── XGBoost/
    │   └── ...
    └── ...

Open report.html

After training, open results/report.html in your browser. It contains the full leaderboard, per-model metrics, and all visualizations in one place.
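
For programmatic access to the same results, the files in `results/` can be read with the standard library. The sketch below is only an assumption-laden helper: `load_leaderboard` and `list_model_artifacts` are hypothetical names, and because the column names of `model_summary.csv` are not documented here, the rows are returned as plain dictionaries keyed by whatever header the file contains.

```python
import csv
from pathlib import Path


def load_leaderboard(results_dir="results"):
    """Return the rows of model_summary.csv as a list of dicts, in file order."""
    summary_path = Path(results_dir) / "model_summary.csv"
    with summary_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def list_model_artifacts(results_dir="results"):
    """Map each per_model/<name>/ folder to the sorted file names inside it."""
    per_model = Path(results_dir) / "per_model"
    return {
        folder.name: sorted(p.name for p in folder.iterdir() if p.is_file())
        for folder in sorted(per_model.iterdir())
        if folder.is_dir()
    }
```

Calling `load_leaderboard()` from the directory that contains `results/` returns one dictionary per model, and `list_model_artifacts()` shows which plots and tuning logs were written for each one.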

Loading the Best Model¶

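import joblib
import pandas as pd

# Features from the preprocessed test split (folder and target names are placeholders)
test_df = pd.read_csv("./output_folder_YYYYMMDD_HHMMSS/test_data.csv")
X_test = test_df.drop(columns=["price"])
model = joblib.load("results/best_model.pkl")
predictions = model.predict(X_test)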

Full Pipeline¶

vllama data --path raw_data.csv --target price
vllama train --path ./output_folder_YYYYMMDD_HHMMSS --target price
# Open results/report.html

See the AutoML Pipeline Guide for a detailed walkthrough with a real dataset.
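
Because the output folder name carries a timestamp, scripting the two commands requires finding the folder that `vllama data` just created. The sketch below is one possible way to do that from Python. It assumes the folder is written to the current working directory and follows the `output_folder_YYYYMMDD_HHMMSS` pattern shown above; the helper names are hypothetical, and only the documented `--path` and `--target` flags are used.

```python
import subprocess
from pathlib import Path


def latest_output_folder(base_dir="."):
    """Return the newest output_folder_YYYYMMDD_HHMMSS directory under base_dir."""
    candidates = sorted(
        p for p in Path(base_dir).glob("output_folder_*") if p.is_dir()
    )
    if not candidates:
        raise FileNotFoundError(f"no output_folder_* directory in {base_dir}")
    # Fixed-width timestamps sort chronologically, so the last name is the newest.
    return candidates[-1]


def build_train_command(data_folder, target):
    """Build the argument list for `vllama train` with the documented flags."""
    return ["vllama", "train", "--path", str(data_folder), "--target", target]


def run_pipeline(raw_csv, target, base_dir="."):
    """Run `vllama data`, then `vllama train` on the folder it produced."""
    subprocess.run(
        ["vllama", "data", "--path", str(raw_csv), "--target", target],
        check=True,  # stop if preprocessing fails
    )
    folder = latest_output_folder(base_dir)  # assumes output lands in base_dir
    subprocess.run(build_train_command(folder, target), check=True)
    return folder
```

With vllama installed, `run_pipeline("raw_data.csv", "price")` would mirror the two shell commands above; if the preprocessing step writes its folder elsewhere, pass that location as `base_dir`.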
