Run Heavy AI Models for Free: No GPU Needed
Vllama has a built-in bridge to Kaggle's free GPU (NVIDIA T4, ~16GB VRAM). You submit a task from your laptop, it runs on Kaggle's infrastructure, and the output downloads back to your machine automatically.
This means you can run Stable Diffusion, text-to-video models, and 3D reconstruction for free, even on a laptop with 4GB of RAM.
What You Need
- A Kaggle account (free, takes 2 minutes)
- Phone number for verification (Kaggle requires this for GPU access)
- Vllama installed:
pip install vllama
Step 1: Get Your Kaggle API Key
- Log in at kaggle.com
- Click your profile icon (top right) → Settings
- Scroll to the API section
- Click Create New Token; this downloads kaggle.json
Open the file. It looks like:
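A Kaggle API token is a small JSON object with two string fields, `username` and `key`. The sketch below shows that shape with placeholder values and checks a downloaded file against it. `load_kaggle_token` is a hypothetical helper written for illustration; it is not part of Vllama or the Kaggle client.

```python
import json
from pathlib import Path

# Shape of a Kaggle token file (placeholder values, not real credentials).
EXAMPLE_TOKEN = {"username": "your_kaggle_username", "key": "your_api_key"}


def load_kaggle_token(path):
    """Read a kaggle.json file and check it has non-empty 'username' and 'key'.

    Hypothetical helper for illustration only.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("kaggle.json must contain a JSON object")
    for field in ("username", "key"):
        value = data.get(field)
        if not isinstance(value, str) or not value:
            raise ValueError(f"kaggle.json is missing a non-empty '{field}' field")
    return data


print(json.dumps(EXAMPLE_TOKEN, indent=2))
```

Calling `load_kaggle_token` with the path to your downloaded file returns the parsed token, or raises `ValueError` if a field is missing. Treat the `key` value like a password and keep it out of version control.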
Step 2: Connect Vllama to Kaggle
If your kaggle.json is already at ~/.kaggle/kaggle.json (default Kaggle CLI location), just run:
Step 3: Run Something
Now just add --service kaggle to any supported command.
Generate an Image
vllama run stabilityai/sd-turbo --service kaggle --prompt "A cyberpunk street at night, neon lights, rain"
Vllama will:
1. Create a Kaggle kernel with GPU
2. Install dependencies
3. Run the model
4. Download vllama_kaggle_<timestamp>.png to your current directory
The whole run takes about 3–5 minutes. The generation itself takes under 10 seconds on the T4; most of the time goes to starting the kernel and installing dependencies.
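For readers curious what step 1 might involve: the official Kaggle CLI describes a kernel with a `kernel-metadata.json` file, in which `enable_gpu` requests an accelerator and `enable_internet` allows package installs. The sketch below builds such a file in Python. Whether Vllama creates its kernels this way is an assumption, and `build_kernel_metadata`, the default `run.py` script name, and the example slug are all hypothetical.

```python
import json


def build_kernel_metadata(username, slug, code_file="run.py"):
    """Return a dict in the shape the Kaggle CLI reads from kernel-metadata.json.

    Hypothetical helper; not confirmed to match what Vllama does internally.
    """
    return {
        "id": f"{username}/{slug}",
        "title": slug,
        "code_file": code_file,
        "language": "python",
        "kernel_type": "script",
        "is_private": True,
        "enable_gpu": True,        # request a GPU session
        "enable_internet": True,   # needed to install dependencies
        "dataset_sources": [],
        "competition_sources": [],
        "kernel_sources": [],
    }


metadata = build_kernel_metadata("your_kaggle_username", "vllama-sd-turbo-demo")
print(json.dumps(metadata, indent=2))
```

With the official CLI, a folder containing this file and the script is submitted with `kaggle kernels push -p <folder>`, and results are fetched with `kaggle kernels output <username>/<slug> -p <dir>`. Vllama automates the equivalent steps for you.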
Generate a Video
vllama run_video damo-vilab/text-to-video-ms-1.7b --service kaggle --prompt "A sunset over the ocean, cinematic"
Generate a 3D Model from an Image
Troubleshooting
"Kaggle API credentials not found"
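The official Kaggle client looks for credentials in the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, and otherwise in a `kaggle.json` file under `~/.kaggle` (or under the directory named by `KAGGLE_CONFIG_DIR`). Assuming Vllama relies on the same lookup, which this page does not confirm, a quick diagnostic might look like the sketch below. `find_kaggle_credentials` is a hypothetical helper.

```python
import os
from pathlib import Path


def find_kaggle_credentials(env=None, home=None):
    """Report where Kaggle credentials would be found, or None if missing.

    Mirrors the lookup of the official Kaggle client; that Vllama uses the
    same lookup is an assumption.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Environment variables work without any file on disk.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "environment variables"

    # 2. Otherwise look for kaggle.json in the config directory.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR") or home / ".kaggle")
    token = config_dir / "kaggle.json"
    if token.is_file():
        return str(token)
    return None


location = find_kaggle_credentials()
print(location or "No Kaggle credentials found; place kaggle.json in ~/.kaggle/")
```

On Linux and macOS the Kaggle client also warns when the token file is readable by other users; `chmod 600 ~/.kaggle/kaggle.json` addresses that.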
"GPU quota exceeded"
Kaggle gives ~30 GPU hours/week for free accounts. If you hit the limit, wait until your quota resets (weekly), or check quota at kaggle.com/account.
Kernel takes too long / times out
Kaggle notebooks have a 12-hour session limit. Individual inference tasks take 3–10 minutes, well within this.
First run is slow
The first run installs dependencies inside the Kaggle kernel. Subsequent runs reuse the environment and are faster.
Kaggle GPU Limits (Free Tier)
| Resource | Limit |
|---|---|
| GPU type | NVIDIA T4 (16GB VRAM) |
| Weekly GPU hours | ~30 hours |
| Session max duration | 12 hours |
| Disk space | 73GB |
| Internet | Enabled |
For most use cases — generating images, videos, 3D models — you'll stay well within the weekly quota.
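As a rough sense of scale, the arithmetic below estimates how many runs fit in the weekly quota. It assumes the whole kernel run (not just the seconds of generation) counts against GPU time, and it uses the ~30-hour figure from the table. Both the assumption and the helper name `runs_per_week` are illustrative.

```python
def runs_per_week(quota_hours=30, minutes_per_run=5):
    """Estimate how many runs of a given length fit in the weekly GPU quota."""
    return int(quota_hours * 60 // minutes_per_run)


print(runs_per_week(minutes_per_run=3))   # 600
print(runs_per_week(minutes_per_run=5))   # 360
print(runs_per_week(minutes_per_run=10))  # 180
```

Even at 10 minutes per run, that leaves room for well over a hundred runs each week.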