Documentation Index
Fetch the complete documentation index at: https://docs.namastex.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Run powerful AI coding agents completely locally - no API keys, no cloud services, no data leaving your machine. Perfect for privacy-sensitive work, learning, or unlimited free usage.
Privacy First: All code stays on your machine. No external API calls.
Supported Models
OpenCode
Open-source coding model
- By: Open-source community
- Size: 15B parameters
- Context: 16K tokens
- Requirements: 16GB+ RAM, GPU recommended
- License: Apache 2.0
Qwen Code
Alibaba’s open-source coder
- By: Alibaba DAMO Academy
- Size: 7B, 14B, 32B parameters
- Context: 32K tokens
- Requirements: 8GB-64GB RAM depending on size
- License: Apache 2.0
Quick Start with Ollama
Ollama makes running local models easy:
Start Ollama
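A minimal sketch of the flow, assuming Ollama is installed from ollama.com and using the `qwen2.5-coder` tags from the Ollama library (substitute any model from the tables below):

```shell
# Start the Ollama server in the background.
# (The macOS/Windows desktop app starts it automatically; skip this step there.)
ollama serve &

# Pull a coding model - pick the size that fits your RAM
ollama pull qwen2.5-coder:7b

# Run it interactively
ollama run qwen2.5-coder:7b "Write a function that reverses a string"
```

The first pull downloads several gigabytes of weights; subsequent runs load from local disk.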
Hardware Requirements
Minimum Specs
| Model | RAM | GPU | Storage | Speed |
|---|---|---|---|---|
| Qwen 7B | 8GB | Optional | 5GB | Good |
| Qwen 14B | 16GB | Recommended | 10GB | Better |
| OpenCode | 16GB | Recommended | 12GB | Good |
| Qwen 32B | 32GB | Required | 25GB | Best |
Recommended Setup
For the best experience, use a machine with a dedicated GPU and 16GB+ of RAM (see the table above).
Configuration
Basic Ollama Setup
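As a basic-setup sketch: Ollama is configured through environment variables. `OLLAMA_HOST`, `OLLAMA_MODELS`, and `OLLAMA_MAX_LOADED_MODELS` are documented Ollama settings; the values shown are illustrative:

```shell
# Bind the server to the default local address and port
export OLLAMA_HOST=127.0.0.1:11434

# Store downloaded model weights on a larger disk (example path)
export OLLAMA_MODELS="$HOME/ollama-models"

# Keep only one model resident in memory at a time
export OLLAMA_MAX_LOADED_MODELS=1
```

Set these before starting `ollama serve` so the server picks them up.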
Advanced Configuration
GPU Acceleration
Enable GPU support for faster inference.
Strengths of Local Models
Complete Privacy
No Data Leakage
- Code never leaves your machine
- No API calls to external services
- No telemetry or tracking
- Perfect for sensitive codebases
Compliance-Ready
- GDPR compliant by design
- No third-party data sharing
- Full audit trail
- Meets enterprise security requirements
No Usage Limits
No Internet Required
Work anywhere:
- ✈️ On airplanes
- 🏔️ Remote locations
- 🔌 During outages
- 🔒 Air-gapped environments
Limitations
Lower Quality
Local models are less capable than cloud models:
| Task Type | Local | Claude | GPT-4 |
|---|---|---|---|
| Simple fixes | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Architecture | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Bug fixing | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Testing | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
Slower
Hardware Intensive
- Requires powerful machine
- GPU strongly recommended
- High RAM usage
- Slower on CPU-only
Best Use Cases
Privacy-Sensitive Work
Learning & Experimentation
Air-Gapped Environments
Cost Reduction
Model Comparison
OpenCode vs Qwen
| Feature | OpenCode | Qwen 7B | Qwen 14B | Qwen 32B |
|---|---|---|---|---|
| Quality | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Speed | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
| RAM | 16GB | 8GB | 16GB | 32GB |
| Best for | General | Quick tasks | Balanced | Quality |
Local vs Cloud
| Feature | Local (Qwen 32B) | Claude Sonnet | Gemini Pro |
|---|---|---|---|
| Privacy | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐ |
| Quality | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Speed | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Cost | $$$ hardware, $0 usage | $0 hardware, $$$ usage | $0 hardware, $$ usage |
Performance Optimization
Use GPU
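To confirm the GPU is actually being used, a quick check (assuming an NVIDIA card; `ollama ps` reports how much of a loaded model sits on GPU versus CPU):

```shell
# Show loaded models and their CPU/GPU split - "100% GPU" means fully offloaded
ollama ps

# Watch GPU memory and utilization while a prompt runs (NVIDIA only)
nvidia-smi --query-gpu=memory.used,utilization.gpu --format=csv -l 1
```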
Adjust Context Window
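One way to adjust the context window is the `num_ctx` parameter in a Modelfile (a standard Ollama mechanism; the 8192 value and `qwen-small-ctx` name are illustrative):

```shell
# Create a variant of the model with a smaller context window
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 8192
EOF
ollama create qwen-small-ctx -f Modelfile
```

A smaller context window cuts memory use and speeds up inference, at the cost of how much code the model can see at once.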
Use Smaller Models for Simple Tasks
Troubleshooting
Ollama not running
Error: “Connection refused to localhost:11434”
Solution:
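Assuming Ollama is installed, the usual fix is to start the server manually and verify it answers on port 11434:

```shell
# Start the server in the background
ollama serve &

# Verify it is listening - a healthy server replies with "Ollama is running"
curl -s http://localhost:11434
```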
Out of memory
Error: “Failed to allocate memory”
Solutions:
- Use smaller model (7B instead of 32B)
- Close other applications
- Enable low VRAM mode:
- Upgrade RAM
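The "low VRAM mode" bullet above can be approximated with Ollama's `num_gpu` parameter, which limits how many model layers are offloaded to the GPU (20 is an illustrative value; the rest run on CPU/RAM):

```shell
# Launch an interactive session, then limit GPU offload from the REPL:
ollama run qwen2.5-coder:7b
# >>> /set parameter num_gpu 20
```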
Very slow inference
Issue: Model taking forever
Solutions:
- Enable GPU acceleration
- Use smaller model
- Reduce context window
- Close background apps
- Check CPU usage (on CPU-only setups it should be near 100% during inference; if it is low, something else is wrong)
Model not found
Error: “Model ‘qwen2.5-coder:32b’ not found”
Solution:
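This usually means the model has not been downloaded yet:

```shell
# List models already on disk
ollama list

# Pull the missing model - the tag must match exactly
ollama pull qwen2.5-coder:32b
```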
Cost Analysis
Hardware Investment
Ongoing Costs
Best Practices
Use for Sensitive Work
Start Small
Begin with Qwen 7B:
Upgrade to 32B if needed
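Sketching that progression with Ollama's standard pull/rm commands:

```shell
# Start with the smallest coder model
ollama pull qwen2.5-coder:7b

# If quality becomes the bottleneck, step up (needs ~32GB RAM per the table above)
ollama pull qwen2.5-coder:32b

# Reclaim disk space from the model you outgrew
ollama rm qwen2.5-coder:7b
```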
Monitor Resources
Combine with Cloud
Hybrid Strategy
Combine local and cloud models:
Strategy 1: Privacy Tiers
Strategy 2: Cost Optimization
Strategy 3: Network-Aware
Real-World Example
Setup for Privacy-First Development
Next Steps
Install Ollama
Get started with local models
Other Agents
Compare with cloud agents
Privacy Workflows
Privacy-first development patterns
Ollama Library
Browse available models

