
avg-kwintes

Deploys a comprehensive self-hosted AI stack on VPS, supporting automation, monitoring, vector search, and speech-to-text capabilities. Integrates tools such as n8n, Ollama, Qdrant, Prometheus, and Grafana for efficient AI processing and maintenance.

Author

ThijsdeZeeuw

Apache License 2.0

Quick Info

GitHub Stars: 1
NPM Weekly Downloads: 0
Tools: 1
Last Updated: 2026-02-19

Tags

ai, monitoring, avg, hosted ai, ai stack, avg kwintes

Local AI Stack for VPS Deployment

A comprehensive self-hosted AI stack designed for VPS deployment, featuring n8n, Ollama, Qdrant, Prometheus, Grafana, Whisper, and more.

Note: This project is based on work from coleam00/local-ai-packaged and Digitl-Alchemyst/Automation-Stack with customizations and improvements.

Features

  • n8n - Low-code automation platform with 400+ integrations
  • Ollama - Local LLM platform
  • Qdrant - High-performance vector store
  • Prometheus - Monitoring and alerting toolkit
  • Grafana - Metrics visualization and analytics
  • Whisper - Speech-to-text processing
  • Caddy - Automatic HTTPS/TLS
  • Supabase - Database and authentication
  • Flowise - AI agent builder
  • Open WebUI - ChatGPT-like interface
  • SearXNG - Privacy-focused search engine

Prerequisites

  • Ubuntu VPS (tested on Ubuntu 22.04 LTS)
  • Domain name with DNS access
  • Minimum 16GB RAM recommended
  • 100GB+ storage recommended
  • Docker installed (version 20.10.0 or later recommended)
  • Docker Compose installed, in either form:
    • Docker Compose plugin (docker compose)
    • Standalone Docker Compose binary (docker-compose)

Note: The setup script will automatically detect whether to use docker compose or docker-compose based on what's available on your system.
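
The detection described in the note could look roughly like the sketch below. It is not the setup script's actual code; the function name detect_compose_command and its injectable which/run parameters are hypothetical, chosen so the logic can be exercised without Docker installed.

```python
import shutil
import subprocess


def detect_compose_command(which=shutil.which, run=subprocess.run):
    """Return the Compose invocation available on this host as an argv prefix.

    Prefers the plugin form (`docker compose`) and falls back to the
    standalone binary (`docker-compose`). Returns None if neither is found.
    """
    if which("docker"):
        # `docker compose version` exits with 0 only when the plugin is installed.
        probe = run(["docker", "compose", "version"], capture_output=True)
        if probe.returncode == 0:
            return ["docker", "compose"]
    if which("docker-compose"):
        return ["docker-compose"]
    return None
```

A caller would append its own arguments to the returned prefix, for example detect_compose_command() + ["-p", "localai", "ps"].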

Installation

  1. Connect to your VPS via SSH:
ssh root@your-vps-ip
  2. Install required packages:
sudo apt update && sudo apt install -y nano git docker.io python3 python3-pip docker-compose
  3. Configure firewall:
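sudo ufw allow 22    # SSH (allow before enabling so you are not locked out; adjust if your SSH port differs)
sudo ufw enable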
sudo ufw allow 5678  # n8n (using port 5678 to avoid conflict with Supabase)
sudo ufw allow 3001  # Flowise
sudo ufw allow 8080  # Open WebUI
sudo ufw allow 3000  # Grafana
sudo ufw allow 80    # HTTP
sudo ufw allow 443   # HTTPS
sudo ufw allow 8000  # Supabase API (Kong)
sudo ufw allow 11434 # Ollama
sudo ufw allow 6333  # Qdrant
sudo ufw allow 9090  # Prometheus
sudo ufw allow 54321 # Supabase Studio
sudo ufw reload
  4. Clone the repository:
git clone https://github.com/ThijsdeZeeuw/avg-kwintes.git
cd avg-kwintes
  5. Run the configuration script to prepare the environment:
# Make the script executable
chmod +x fix_config.sh

# Run the configuration script
sudo ./fix_config.sh

The configuration script will:

  • Check and install all necessary system dependencies
  • Configure the correct firewall rules for all services
  • Detect and resolve any port conflicts
  • Generate utility scripts for maintenance
  • Create a basic .env file if one doesn't exist

  6. Run the interactive setup to complete configuration:
python3 start_services.py --interactive
  7. Start the services:
python3 start_services.py --profile cpu

This sequence ensures that everything is properly configured before starting the services, avoiding port conflicts and other setup issues.

Utility Scripts

The configuration process creates several helpful utility scripts:

Update Script

To update your Local AI Stack to the latest version:

sudo ./update_stack.sh

This script will pull the latest Docker images, apply necessary configuration fixes, and restart all services.

Backup Script

To create a complete backup of your Local AI Stack data:

sudo ./backup_stack.sh

This script will back up all Docker volumes, configuration files, and secrets to a timestamped archive.
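
For illustration, the timestamped-archive idea can be sketched as follows. This is not the contents of backup_stack.sh, which is not shown here. It handles only plain files and directories (Docker volumes would need extra steps), and create_backup and the local-ai-backup prefix are hypothetical names.

```python
import tarfile
import time
from pathlib import Path


def create_backup(paths, dest_dir, prefix="local-ai-backup"):
    """Pack the given files or directories into a timestamped .tar.gz archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for item in map(Path, paths):
            if item.exists():  # skip anything not present on this host
                tar.add(str(item), arcname=item.name)
    return archive
```

Calling create_backup([".env", "secrets.txt"], "backups") from the repository directory would write one archive per run, so earlier backups are never overwritten unless two runs land in the same second.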

Ollama Models

The following models are automatically installed and available in the system:

Large Language Models (LLMs)

Model | Source | Description
gemma3:12b | Google | A 12B-parameter model from Google's Gemma family for general text understanding and generation
granite3-guardian:8b | IBM | An 8B-parameter model designed to detect risks in prompts and responses
granite3.1-dense:latest | IBM | IBM's Granite 3.1 dense transformer model for general language tasks
granite3.1-moe:3b | IBM | A 3B-parameter mixture-of-experts model optimized for efficient inference
granite3.2:latest | IBM | IBM's Granite 3.2 language model with improved capabilities
llama3.2-vision | Meta | A multimodal model capable of understanding both text and images
minicpm-v:8b | OpenBMB | A compact 8B-parameter multimodal (vision-language) model optimized for efficient deployment
mistral-nemo:12b | Mistral AI | A 12B-parameter general-purpose model from Mistral AI
qwen2.5:7b-instruct-q4_K_M | Alibaba | A 4-bit quantized (q4_K_M), instruction-tuned 7B-parameter model optimized for efficiency
reader-lm:latest | Jina AI | A specialized model that converts HTML content to Markdown

Embedding Models

Model | Source | Description
granite-embedding:278m | IBM | A compact embedding model for efficient text vectorization
jeffh/intfloat-multilingual-e5-large-instruct:f16 | Hugging Face | A multilingual embedding model optimized for instruction following
nomic-embed-text:latest | Nomic AI | A general-purpose text embedding model for semantic search and similarity

These models are automatically downloaded during the initial setup process. The system supports both CPU and GPU (NVIDIA/AMD) inference depending on your hardware configuration.
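
To confirm that the downloads finished, one option is to compare the list above with what Ollama reports. The sketch below assumes Ollama is reachable on its default port 11434 and uses its /api/tags endpoint; the helper names normalize, missing_models, and installed_models are hypothetical and not part of this repository.

```python
import json
import urllib.request


def normalize(name):
    """Ollama lists models with an explicit tag; an untagged name means ':latest'."""
    last_segment = name.rsplit("/", 1)[-1]
    return name if ":" in last_segment else f"{name}:latest"


def missing_models(required, installed):
    """Return the required models that are not in the installed list, in order."""
    have = {normalize(n) for n in installed}
    return [m for m in required if normalize(m) not in have]


def installed_models(base_url="http://localhost:11434"):
    """Fetch installed model names from Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return [m["name"] for m in json.load(resp)["models"]]
```

For example, missing_models(["gemma3:12b", "llama3.2-vision"], installed_models()) returns whichever of the two names still needs to be pulled.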

Accessing Services

After installation, you can access the following services:

  • n8n: https://n8n.kwintes.cloud
  • Web UI: https://openwebui.kwintes.cloud
  • Flowise: https://flowise.kwintes.cloud
  • Supabase: https://supabase.kwintes.cloud
  • Supabase Studio: http://localhost:54321 or https://studio.supabase.kwintes.cloud
  • Grafana: https://grafana.kwintes.cloud
  • Prometheus: https://prometheus.kwintes.cloud
  • Whisper API: https://whisper.kwintes.cloud
  • Qdrant API: https://qdrant.kwintes.cloud

Monitoring

The stack includes comprehensive monitoring:

  1. Access Grafana at https://grafana.kwintes.cloud
  2. Default credentials: admin / (password from secrets.txt)
  3. Add Prometheus as a data source (URL: http://prometheus:9090)

  4. Access Prometheus at https://prometheus.kwintes.cloud

  5. View metrics and create alerts
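
As a quick check outside Grafana, Prometheus can be asked directly which scrape targets are down. The following is a minimal sketch, assuming Prometheus is reachable on port 9090 and using its standard instant-query endpoint with the built-in up metric; the function names down_targets and query_up are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def down_targets(payload):
    """Return (job, instance) pairs whose `up` sample is 0 in an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed")
    down = []
    for sample in payload["data"]["result"]:
        _timestamp, value = sample["value"]  # the value arrives as a string
        if float(value) == 0.0:
            labels = sample["metric"]
            down.append((labels.get("job", "?"), labels.get("instance", "?")))
    return down


def query_up(base_url="http://localhost:9090"):
    """Run the instant query `up` against the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": "up"})
    with urllib.request.urlopen(f"{base_url}/api/v1/query?{query}", timeout=10) as resp:
        return json.load(resp)
```

Calling down_targets(query_up()) yields an empty list when every scraped service is healthy.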

Security Notes

  1. All secrets are saved to secrets.txt - keep this file secure
  2. All services are configured to use HTTPS through Caddy
  3. Firewall rules are configured to allow only necessary ports
  4. Default credentials should be changed after first login

Maintenance

To update the stack:

cd avg-kwintes
git pull
python3 start_services.py --profile cpu

To restart services:

docker compose -p localai down
python3 start_services.py --profile cpu

Troubleshooting

  1. Docker Compose issues

If you encounter errors with Docker Compose commands such as:

unknown shorthand flag: 'p' in -p

the command format is incompatible with your Docker Compose version.

Solution: The script automatically detects and uses the correct Docker Compose command format for your system. If you are running commands manually, use:

  • For the Docker Compose plugin: docker compose -p localai ...
  • For the standalone binary: docker-compose -p localai ...

If neither works, install the standalone Docker Compose binary:

sudo curl -L "https://github.com/docker/compose/releases/download/v2.24.5/docker-compose-linux-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

  2. Check service logs:
docker compose -p localai logs -f [service_name]
# or
docker-compose -p localai logs -f [service_name]
  3. Verify service status:
docker compose -p localai ps
# or
docker-compose -p localai ps
  4. Check monitoring:
    • Visit Grafana dashboard
    • Check Prometheus targets
    • Review service health endpoints

  5. Service restart:

# Restart all services
docker compose -p localai down
python3 start_services.py --profile cpu

Support

For issues and feature requests, please open an issue on the GitHub repository.


Created and maintained by Z4Y

Security Features

This setup prioritizes security through multiple layers:

  1. Local Deployment
    • All AI models run locally on your VPS
    • No data is sent to external AI services
    • Complete control over data privacy and security

  2. Secure Infrastructure
    • Automatic HTTPS/TLS encryption via Caddy
    • Firewall rules limiting access to necessary ports
    • Secure secret management with environment variables
    • Regular security updates through Docker containers

  3. Access Control
    • Supabase authentication for user management
    • Role-based access control
    • Audit logging for all system activities
    • Secure API endpoints with authentication

  4. Data Protection
    • Local vector database (Qdrant) for secure document storage
    • Encrypted communication between services
    • No external API dependencies for core functionality
    • Regular backup capabilities

Local AI Capabilities

The system leverages powerful local models for various tasks:

Text Processing

  • Document summarization and analysis
  • Multi-language support (via multilingual models)
  • Question answering and information extraction
  • Text classification and sentiment analysis

Vision Capabilities

  • Image analysis and description
  • Document scanning and text extraction
  • Visual understanding and reasoning
  • Accessibility features for visual content

Example Use Cases

  1. Document Analysis

# Example: Analyzing client reports
input_text = "Client report from session..."
model = "qwen2.5:7b-instruct-q4_K_M"
# Process and analyze the report locally

  2. Multi-language Support

# Example: Processing documents in multiple languages
text = "Document in Dutch..."
model = "jeffh/intfloat-multilingual-e5-large-instruct:f16"
# Process multilingual content

  3. Visual Document Processing

# Example: Analyzing scanned documents
image = "scanned_report.jpg"
model = "llama3.2-vision"
# Extract and analyze visual content
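
The examples above name a model and an input but stop short of the actual call. One possible way to complete them is Ollama's HTTP API, sketched below under the assumption that Ollama listens on its default port 11434. The helper names build_generate_request and generate are hypothetical and not part of this repository. The sketch covers the text and vision cases; the multilingual embedding model would go through Ollama's embedding endpoint instead.

```python
import base64
import json
import urllib.request


def build_generate_request(model, prompt, image_bytes=None):
    """Build the JSON body for Ollama's POST /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Vision models take images as base64-encoded strings.
        body["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return body


def generate(body, base_url="http://localhost:11434"):
    """Send the request to a local Ollama server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["response"]
```

A report summary would then be generate(build_generate_request("qwen2.5:7b-instruct-q4_K_M", "Summarize this report: " + input_text)), and a scanned page can be passed to llama3.2-vision by reading the file in binary mode and supplying the bytes as image_bytes.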

GGZ/FBW Client Support

This system is particularly valuable for GGZ (Mental Healthcare) and FBW (Forensic Protected Living) organizations:

Document Generation and Analysis

  1. Client Report Generation
    • Automatically generate structured reports from session notes
    • Maintain consistent documentation standards
    • Support multiple languages for diverse client populations
    • Ensure privacy by processing all data locally

  2. Treatment Plan Analysis
    • Analyze treatment plans for completeness and consistency
    • Identify potential gaps in documentation
    • Suggest improvements based on best practices
    • Track progress over time

  3. Risk Assessment Support
    • Process and analyze risk assessment documents
    • Identify patterns and trends in risk factors
    • Generate structured risk reports
    • Support evidence-based decision making

Client Understanding and Support

  1. Communication Analysis
    • Process and analyze client communications
    • Identify key themes and concerns
    • Support multilingual communication
    • Track changes in client status over time

  2. Documentation Quality
    • Ensure consistent documentation standards
    • Identify missing or incomplete information
    • Suggest improvements in documentation
    • Support quality assurance processes

  3. Knowledge Management
    • Create searchable knowledge bases from client documents
    • Support evidence-based practice
    • Enable quick access to relevant information
    • Maintain privacy and security of sensitive data
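
A searchable knowledge base of this kind typically works by embedding each document as a vector and ranking documents by similarity to an embedded query. The sketch below shows only that ranking step in plain Python with toy vectors. It is a stand-in for what a vector store such as Qdrant does at scale, not Qdrant's API, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, documents, limit=3):
    """Rank (doc_id, vector) pairs by cosine similarity to the query vector."""
    scored = [(doc_id, cosine_similarity(query_vector, vec)) for doc_id, vec in documents]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]
```

In the real stack the vectors would come from one of the embedding models listed earlier, and the stored vectors would live in Qdrant rather than in a Python list.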

Benefits for GGZ/FBW Organizations

  1. Privacy and Compliance
    • All processing happens locally
    • No external data transmission
    • Compliant with healthcare privacy regulations
    • Full control over data security

  2. Efficiency Improvements
    • Automated document processing
    • Reduced administrative burden
    • Faster access to relevant information
    • Support for evidence-based practice

  3. Quality Enhancement
    • Consistent documentation standards
    • Improved risk assessment
    • Better tracking of client progress
    • Enhanced decision support

Port Configuration

To avoid port conflicts between services, we've set up consistent port mappings:

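sudo ufw allow 22    # SSH (allow before enabling so you are not locked out; adjust if your SSH port differs)
sudo ufw enable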
sudo ufw allow 5678  # n8n (using port 5678 to avoid conflict with Supabase)
sudo ufw allow 3001  # Flowise
sudo ufw allow 8080  # Open WebUI
sudo ufw allow 3000  # Grafana
sudo ufw allow 80    # HTTP
sudo ufw allow 443   # HTTPS
sudo ufw allow 8000  # Supabase API (Kong)
sudo ufw allow 11434 # Ollama
sudo ufw allow 6333  # Qdrant
sudo ufw allow 9090  # Prometheus
sudo ufw allow 54321 # Supabase Studio
sudo ufw reload

Key points about our port configuration:

  1. n8n uses port 5678 instead of 8000 to avoid conflicts with Supabase
  2. Each service uses consistent internal and external port mappings
  3. Port settings are handled automatically by the setup scripts
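
A port conflict of the kind described here is simply two services claiming the same number. The sketch below checks the port list from this section for duplicates. It is illustrative only and is not how the setup scripts detect conflicts; the names PORTS and find_conflicts are hypothetical.

```python
from collections import defaultdict

# Service-to-port mapping, copied from the firewall rules in this section.
PORTS = {
    "n8n": 5678,
    "Flowise": 3001,
    "Open WebUI": 8080,
    "Grafana": 3000,
    "HTTP": 80,
    "HTTPS": 443,
    "Supabase API (Kong)": 8000,
    "Ollama": 11434,
    "Qdrant": 6333,
    "Prometheus": 9090,
    "Supabase Studio": 54321,
}


def find_conflicts(ports):
    """Return {port: [services]} for every port claimed by more than one service."""
    by_port = defaultdict(list)
    for service, port in ports.items():
        by_port[port].append(service)
    return {port: names for port, names in by_port.items() if len(names) > 1}
```

With the mapping above, find_conflicts(PORTS) is empty. Moving n8n to port 8000 would report a clash with the Supabase API, which is the situation the 5678 assignment avoids.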
