Incus Cluster: Set Up Production-Grade VPS Infrastructure with MicroOVN & MicroCeph
A complete tutorial on building an Incus cluster for production VPS hosting. Set up a multi-node cluster with MicroOVN networking and MicroCeph distributed storage - from installation to maintenance.

Building VPS Infrastructure with an Incus Cluster
You are already comfortable running Incus on a single machine. Now it is time to level up: building a production-grade cluster to serve tens to hundreds of customer VPSes.
This tutorial walks you through building infrastructure like the one we run at Dalang.io - with redundancy, high availability, and scalability.
Arsitektur Overview
Target Architecture
                ┌─────────────────────────────────────────────┐
                │                INCUS CLUSTER                │
                │                                             │
┌──────────┐    │ ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  Client  │────┼─│  Node 1  │───│  Node 2  │───│  Node 3  │  │
│ (Admin)  │    │ │ (Leader) │   │          │   │          │  │
└──────────┘    │ └────┬─────┘   └────┬─────┘   └────┬─────┘  │
                │      │              │              │        │
                │ ┌────┴──────────────┴──────────────┴─────┐  │
                │ │            MicroOVN Network            │  │
                │ │     (Software-Defined Networking)      │  │
                │ └────┬──────────────┬──────────────┬─────┘  │
                │      │              │              │        │
                │ ┌────┴──────────────┴──────────────┴─────┐  │
                │ │           MicroCeph Storage            │  │
                │ │      (Distributed Block Storage)       │  │
                │ └────────────────────────────────────────┘  │
                │                                             │
                └─────────────────────────────────────────────┘
Main Components
| Component | Function |
|---|---|
| Incus Cluster | Container/VM orchestration |
| MicroOVN | Software-defined networking (SDN) |
| MicroCeph | Distributed storage (Ceph) |
| OVN Router | Gateway for the VM network |
Example Cluster Specification
| Node | IP | CPU | RAM | Ceph Disks | Role |
|---|---|---|---|---|---|
| node-01 | 10.0.0.253 | 24c | 128 GiB | 3x 1.8T HDD | database-leader |
| node-02 | 10.0.0.252 | 24c | 128 GiB | 2x 1.8T HDD | database |
| node-03 | 10.0.0.250 | 24c | 128 GiB | 4x 1.8T HDD | database |
| node-04 | 10.0.0.249 | 24c | 128 GiB | 1x 120G NVMe | standby |
| node-05 | 10.0.0.248 | 24c | 128 GiB | 1x 120G NVMe | standby |
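With 3x replication (configured for the Ceph pool later on), usable capacity is roughly the raw total divided by three. As a hedged back-of-the-envelope sketch for the nine HDD OSDs in the table above:

```shell
#!/bin/sh
# Rough usable-capacity estimate for the example cluster above:
# 9 HDD OSDs (3 + 2 + 4 on node-01..03) of 1.8 TiB each, 3x replication.
OSDS=9
OSD_TIB=1.8
REPLICAS=3
RAW=$(awk -v n="$OSDS" -v s="$OSD_TIB" 'BEGIN{printf "%.1f", n*s}')
USABLE=$(awk -v raw="$RAW" -v r="$REPLICAS" 'BEGIN{printf "%.1f", raw/r}')
echo "raw: ${RAW} TiB, usable: ~${USABLE} TiB"   # raw: 16.2 TiB, usable: ~5.4 TiB
```

In practice Ceph needs free headroom for rebalancing, so plan to stay well below that usable figure.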
Part 1: Prerequisites & Planning
Hardware Requirements
Minimum per Node:
- CPU: 8 cores (for VM support)
- RAM: 32 GB
- Storage: 1x SSD for the OS, 1x disk for the Ceph OSD
- Network: 2x 1GbE (management + storage)
Recommended per Node:
- CPU: 24+ cores
- RAM: 128+ GB
- Storage: NVMe for the OS, multiple HDDs/SSDs for Ceph
- Network: 10GbE for storage traffic
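A small sketch to check a node against the minimums above (the thresholds are the values from this list; adjust them to your own baseline):

```shell
#!/bin/sh
# Compare this node against the minimum requirements listed above.
MIN_CORES=8
MIN_RAM_GB=32
cores=$(nproc)
ram_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
[ "$cores" -ge "$MIN_CORES" ] || echo "WARNING: only $cores cores (< $MIN_CORES)"
[ "$ram_gb" -ge "$MIN_RAM_GB" ] || echo "WARNING: only ${ram_gb} GB RAM (< $MIN_RAM_GB)"
echo "cores=$cores ram=${ram_gb}GB"
```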
Network Planning
Management Network: 10.0.0.0/24
├── node-01: 10.0.0.253
├── node-02: 10.0.0.252
├── node-03: 10.0.0.250
├── node-04: 10.0.0.249
└── node-05: 10.0.0.248
OVN Provider Network: 10.69.0.0/24
└── OVN Router External: 10.69.0.100
VM Internal Network: 10.70.0.0/24
├── OVN Router Internal: 10.70.0.1 (gateway)
└── VMs: 10.70.0.2-254
OS Preparation (All Nodes)
# Update system
apt update && apt upgrade -y
# Set hostname
hostnamectl set-hostname node-01 # adjust per node
# Edit /etc/hosts (all nodes)
cat >> /etc/hosts << EOF
10.0.0.253 node-01
10.0.0.252 node-02
10.0.0.250 node-03
10.0.0.249 node-04
10.0.0.248 node-05
EOF
# Disable swap (required for Ceph)
swapoff -a
sed -i '/swap/d' /etc/fstab
# Enable IP forwarding
cat >> /etc/sysctl.conf << EOF
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
EOF
sysctl -p
# Load kernel modules
modprobe br_netfilter
modprobe overlay
echo "br_netfilter" >> /etc/modules
echo "overlay" >> /etc/modules
Part 2: Install MicroCeph (Distributed Storage)
MicroCeph is Ceph packaged as a snap for easy deployment.
Install MicroCeph (All Nodes)
# Install snapd if not already present
apt install snapd -y
# Install MicroCeph
snap install microceph --channel=latest/stable
# Verify installation
microceph version
Bootstrap Cluster (Node 1 Only)
# Bootstrap on the first node
microceph cluster bootstrap
# Check status
microceph status
Join Cluster (Nodes 2, 3, and so on)
On Node 1 (Leader):
# Generate join token
microceph cluster add node-02
# Output: eyJuYW1lIjoibm9kZS0wMiIsInNlY3Jl...
microceph cluster add node-03
# Note down the token for each node
On Nodes 2, 3 (and so on):
# Join cluster
microceph cluster join <token>
# Verify
microceph status
Add Ceph OSDs (All Nodes)
Each node needs to add disks as Ceph OSDs:
# List available disks
lsblk
# Identify the disks that will become OSDs (BE CAREFUL!)
# Example: /dev/sdb and /dev/sdc are empty disks for Ceph
# Add a disk as an OSD
microceph disk add /dev/sdb --wipe
microceph disk add /dev/sdc --wipe
# Repeat for all available disks
# Verify OSDs
sudo ceph osd tree
Verify Ceph Cluster
# Cluster status
sudo ceph status
# Healthy output looks like:
# cluster:
# id: xxxx-xxxx-xxxx
# health: HEALTH_OK
#
# services:
# mon: 3 daemons
# osd: 9 osds: 9 up, 9 in
#
# data:
# pools: 1 pools, 1 pgs
# objects: 0 objects, 0 B
# usage: x GiB used, y TiB avail
# OSD details
sudo ceph osd df
# Pool status
sudo ceph df
Create a Ceph Pool for Incus
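Before creating the pool, you can sanity-check the pg_num value. A common rule of thumb (a sketch, not an official formula) targets roughly 100 PGs per OSD divided by the replication factor, rounded down to a power of two; the 64 used below is a conservative starting point, and Ceph's pg_autoscaler can grow it later:

```shell
#!/bin/sh
# Rule-of-thumb PG count: (OSDs * 100) / replicas,
# rounded down to a power of two.
pg_count() {
  osds=$1
  replicas=$2
  target=$(( osds * 100 / replicas ))
  pg=1
  while [ $(( pg * 2 )) -le "$target" ]; do
    pg=$(( pg * 2 ))
  done
  echo "$pg"
}
pg_count 9 3   # 9 OSDs at 3x replication -> 256
```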
# Create an RBD pool for Incus
sudo ceph osd pool create incus 64 64
sudo ceph osd pool set incus size 3 # 3x replication
sudo rbd pool init incus
# Verify
sudo ceph osd pool ls detail
Part 3: Install MicroOVN (Software-Defined Networking)
MicroOVN provides OVN (Open Virtual Network) for scalable networking.
Install MicroOVN (All Nodes)
# Install MicroOVN snap
snap install microovn --channel=latest/stable
# Verify
microovn version
Bootstrap MicroOVN Cluster (Node 1)
# Bootstrap
microovn cluster bootstrap
# Status
microovn status
Join MicroOVN Cluster (Nodes 2, 3, and so on)
On Node 1:
# Generate token
microovn cluster add node-02
microovn cluster add node-03
On Nodes 2, 3:
# Join
microovn cluster join <token>
# Verify
microovn status
Setup OVS Socket Symlink (CRITICAL!)
Incus needs access to the OVS socket. Because MicroOVN ships as a snap, the socket lives at a different path:
# Create the symlink on ALL nodes
mkdir -p /run/openvswitch
ln -sf /var/snap/microovn/common/run/switch/db.sock /run/openvswitch/db.sock
# Make it persistent via rc.local
cat > /etc/rc.local << 'EOF'
#!/bin/bash
mkdir -p /run/openvswitch
ln -sf /var/snap/microovn/common/run/switch/db.sock /run/openvswitch/db.sock
exit 0
EOF
chmod +x /etc/rc.local
Verify MicroOVN
# Status of all services
snap services microovn
# OVN Northbound DB
sudo ovn-nbctl show
# OVN Southbound DB
sudo ovn-sbctl show
# OVS bridges
sudo ovs-vsctl show
Part 4: Install & Configure Incus Cluster
Install Incus (All Nodes)
# Via snap (recommended)
snap install incus --channel=latest/stable
# Add user to group
usermod -aG incus-admin $USER
Bootstrap Incus Cluster (Node 1)
# Interactive init
incus admin init
# Answers for clustering:
# Would you like to use clustering? yes
# What IP or DNS name should be used to reach this server? 10.0.0.253
# Are you joining an existing cluster? no
# What member name should be used to identify this server? node-01
# Do you want to configure a new local storage pool? no # (we will use Ceph)
# Do you want to configure a new remote storage pool? yes
# Name: ceph
# Driver: ceph
# ceph.osd.pg_num: 64
# ceph.cluster_name: ceph
# ceph.user.name: admin
# source: incus
# Would you like to connect to a MAAS server? no
# Would you like to configure Incus to use an existing bridge? no
# Would you like to create a new Fan overlay network? no
# Would you like to configure Incus to use an existing network? no
# What should the new bridge be called? incusbr0
# IPv4 address: 10.10.10.1/24
# IPv6: none
# Would you like the server to be available over network? yes
# Address: [::]
# Port: 8443
# Trust password: <set-password>
Join Nodes to the Incus Cluster
On Node 1:
# Generate join token
incus cluster add node-02
# Output token
incus cluster add node-03
# Output token
On Nodes 2, 3:
incus admin init
# Answers:
# Would you like to use clustering? yes
# IP address: 10.0.0.252 # (adjust per node)
# Are you joining an existing cluster? yes
# Please provide join token: <paste-token>
Verify Incus Cluster
# List cluster members
incus cluster list
# Output:
# +----------+----------------------------+------------------+--------------+----------------+
# | NAME | URL | ROLES | ARCHITECTURE | FAILURE DOMAIN |
# +----------+----------------------------+------------------+--------------+----------------+
# | node-01 | https://10.0.0.253:8443 | database-leader | x86_64 | default |
# | node-02 | https://10.0.0.252:8443 | database | x86_64 | default |
# | node-03 | https://10.0.0.250:8443 | database | x86_64 | default |
# +----------+----------------------------+------------------+--------------+----------------+
# Check the Ceph storage pool
incus storage list
incus storage info ceph
Part 5: Configure OVN Network
Create OVN Network
# Create the uplink (provider) network
incus network create ovn-uplink \
    --type=physical \
    parent=ens18 \
    ipv4.ovn.ranges=10.69.0.10-10.69.0.99 \
    ipv4.gateway=10.69.0.1/24 \
    dns.nameservers=8.8.8.8
# Create the OVN network for VMs
incus network create ovn1 \
    --type=ovn \
    network=ovn-uplink \
    ipv4.address=10.70.0.1/24 \
    ipv4.nat=true \
    ipv6.address=none
Verify OVN Network
# List networks
incus network list
# Info OVN network
incus network info ovn1
# Check the OVN router in the Northbound DB
sudo ovn-nbctl show
# The output will show something like:
# switch <uuid> (incus-net-ovn1)
# port incus-net-ovn1-<instance>-eth0
# addresses: ["xx:xx:xx:xx:xx:xx 10.70.0.x"]
# router <uuid> (incus-net-ovn1-lr)
# port incus-net-ovn1-lr-lrp-ext
# mac: "xx:xx:xx:xx:xx:xx"
# networks: ["10.69.0.100/24"]
# port lrp-incus-net-ovn1
# mac: "xx:xx:xx:xx:xx:xx"
# networks: ["10.70.0.1/24"]
Create a Profile for OVN
# Create a profile that uses OVN and Ceph
incus profile create vps-ovn
incus profile edit vps-ovn
config:
  boot.autostart: "true"
  limits.cpu: "2"
  limits.memory: 4GiB
  security.nesting: "false"
description: VPS with OVN networking
devices:
  eth0:
    name: eth0
    network: ovn1
    type: nic
  root:
    path: /
    pool: ceph
    size: 20GiB
    type: disk
name: vps-ovn
Setup Linux Bridge for the Incus-OVN Connection
MicroOVN needs a bridge for the connection to Incus. This is done ONCE per node:
# Create the bridge
ip link add name incusovn1 type bridge
ip link set incusovn1 up
# Create a veth pair for the connection to OVS
ip link add incusovn1a type veth peer name incusovn1b
ip link set incusovn1a master incusovn1
ip link set incusovn1a up
ip link set incusovn1b up
# The connection to OVS itself is made automatically by Incus
Make this persistent with a script in /etc/rc.local:
#!/bin/bash
# OVS socket symlink
mkdir -p /run/openvswitch
ln -sf /var/snap/microovn/common/run/switch/db.sock /run/openvswitch/db.sock
# Incus OVN bridge
ip link add name incusovn1 type bridge 2>/dev/null || true
ip link set incusovn1 up
ip link add incusovn1a type veth peer name incusovn1b 2>/dev/null || true
ip link set incusovn1a master incusovn1
ip link set incusovn1a up
ip link set incusovn1b up
exit 0
Part 6: Deploy & Test Instance
Launch Test VM
# Launch a VM on the OVN network
incus launch images:ubuntu/24.04 test-vm --vm --profile vps-ovn
# Or target a specific node
incus launch images:ubuntu/24.04 test-vm --vm --profile vps-ovn --target node-02
# Check status
incus list
# Output:
# +---------+---------+----------------------+------+-----------------+----------+
# | NAME    | STATE   | IPV4                 | IPV6 | TYPE            | LOCATION |
# +---------+---------+----------------------+------+-----------------+----------+
# | test-vm | RUNNING | 10.70.0.129 (enp5s0) |      | VIRTUAL-MACHINE | node-02  |
# +---------+---------+----------------------+------+-----------------+----------+
Test Connectivity
# From a cluster node
ping 10.70.0.129
# Shell into the VM
incus exec test-vm -- bash
# From inside the VM, test internet access
ping 8.8.8.8
curl -I https://google.com
Setup Routing from an External Machine
If you want to reach the VMs from a machine outside the cluster:
# On the external machine (e.g., a proxy server)
# Route to the provider network via the gateway node
ip route add 10.69.0.0/24 via 10.0.0.248 # via the node acting as OVN gateway
# Route to the VM network via the OVN router
ip route add 10.70.0.0/24 via 10.0.0.248
# On the gateway node, make sure there is a route to the VM network
ip route add 10.70.0.0/24 via 10.69.0.100 # via the OVN router
Live Migration Test
# Move the VM from node-02 to node-03
incus move test-vm --target node-03
# Check the new location
incus list test-vm
# The VM keeps running during the move (live migration)
Part 7: Debug & Troubleshooting Cluster
Incus Cluster Debugging
Cluster Status Issues:
# Check cluster health
incus cluster list
# If a node is offline
incus cluster show node-02
# Force-evacuate instances from the problem node
incus cluster evacuate node-02
# Remove a dead node (BE CAREFUL!)
incus cluster remove node-02 --force
Database Issues:
# Check Dqlite cluster status
incus admin cluster show-log | tail -50
# If the database is corrupt, back it up first
incus admin shutdown
cp -r /var/snap/incus/common/incus /backup/
# or
cp -r /var/lib/incus /backup/
MicroOVN Debugging
OVN Connection Issues:
# Check OVN services
snap services microovn
# Restart if needed
snap restart microovn
# Check the OVS socket
ls -la /run/openvswitch/
ls -la /var/snap/microovn/common/run/switch/
# Recreate the symlink if it is missing
ln -sf /var/snap/microovn/common/run/switch/db.sock /run/openvswitch/db.sock
OVN Network Debugging:
# Northbound DB - logical switches & routers
sudo ovn-nbctl show
# Southbound DB - chassis & bindings
sudo ovn-sbctl show
# List all logical switches
sudo ovn-nbctl ls-list
# List ports on switch
sudo ovn-nbctl lsp-list incus-net-ovn1
# Check port bindings
sudo ovn-sbctl list port_binding
# Find gateway chassis
sudo ovn-sbctl list port_binding cr-incus-net-ovn1-lr-lrp-ext | grep chassis
sudo ovn-sbctl list chassis
VM Network Not Working:
# Problem: the VM gets no IP or has no internet access
# 1. Check the OVN logical port
sudo ovn-nbctl lsp-list incus-net-ovn1 | grep <vm-name>
# 2. Check the binding to a chassis
sudo ovn-sbctl find port_binding logical_port=incus-net-ovn1-<vm>-eth0
# 3. Check OVS flows
sudo ovs-ofctl dump-flows br-int | head -50
# 4. Restart the OVN controller on the VM's node
sudo snap restart microovn.ovn-controller
# 5. If it still fails, restart Incus (installed as a snap)
sudo snap restart incus
Bridge Conflict after Reboot:
# Problem: "Network ovn1 unavailable on this server"
# Cause: OVS has taken over the bridge name
# 1. Delete the conflicting OVS bridge
ovs-vsctl del-br incusovn1
# 2. Create the Linux bridge
ip link add name incusovn1 type bridge
ip link set incusovn1 up
# 3. Create the veth pair
ip link add incusovn1a type veth peer name incusovn1b
ip link set incusovn1a master incusovn1
ip link set incusovn1a up
ip link set incusovn1b up
# 4. Restart Incus (installed as a snap)
snap restart incus
MicroCeph Debugging
Ceph Health Issues:
# Status overall
sudo ceph status
sudo ceph health detail
# OSD issues
sudo ceph osd tree
sudo ceph osd df
# PG issues
sudo ceph pg stat
sudo ceph pg dump_stuck
# Recovery status
sudo ceph -w # watch mode
OSD Down:
# Check OSD status
sudo ceph osd tree
# Look at the log of the down OSD
journalctl -u snap.microceph.osd -n 100
# Manually start the OSD
snap start microceph.osd
# If the disk is faulty
sudo ceph osd out osd.X
# Replace the disk
microceph disk add /dev/new-disk --wipe
Storage Pool Full:
# Check usage
sudo ceph df
incus storage info ceph
# Identify large volumes
sudo rbd ls incus
sudo rbd du incus
# Cleanup unused images
incus image list
incus image delete <fingerprint>
# If Ceph is too full to operate
sudo ceph osd set-full-ratio 0.97 # temporary!
# Delete data
# Reset the ratio
sudo ceph osd set-full-ratio 0.95
Instance Issues
VM Will Not Start:
# Check the error
incus start my-vm --console
# Check the log
incus info my-vm --show-log
# Network unavailable error
# -> Check the OVN network and bridge (see above)
# Storage error
# -> Check Ceph health
sudo ceph health detail
Instance Stuck:
# Force stop
incus stop my-vm --force
# If it is still stuck, check the QEMU process (for VMs)
ps aux | grep qemu | grep my-vm
kill -9 <pid>
# Clean up state
incus delete my-vm --force
Part 8: Maintenance Procedures
Rolling Upgrade Procedure
Upgrade the cluster one node at a time to minimize downtime:
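The per-node steps can also be wrapped in a loop. This is a hedged sketch, not a turnkey script: the node names are from the example cluster, passwordless SSH is assumed, and it defaults to a dry run that only prints the commands:

```shell
#!/bin/bash
# Rolling-upgrade loop (sketch). DRY_RUN=1 (default) prints the commands
# instead of executing them; set DRY_RUN=0 only after reviewing.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}
for node in node-01 node-02 node-03; do
  echo "=== $node ==="
  run incus cluster evacuate "$node"
  run ssh "$node" "snap refresh incus microceph microovn && reboot"
  # In a real run, wait here until the member shows ONLINE in
  # 'incus cluster list' before restoring.
  run incus cluster restore "$node"
done
```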
# 1. Evacuate the node to be upgraded
incus cluster evacuate node-01
# 2. Verify the instances moved
incus list
# 3. Update node-01
ssh node-01
snap refresh incus
snap refresh microceph
snap refresh microovn
reboot
# 4. Verify the node is back online
incus cluster list
# 5. Restore instances to the node
incus cluster restore node-01
# 6. Repeat for the remaining nodes
Adding a New Node
# On an existing cluster node
incus cluster add new-node
# Copy token
microceph cluster add new-node
# Copy token
microovn cluster add new-node
# Copy token
# On the new node:
# 1. Install all snaps
snap install incus microceph microovn
# 2. Join the clusters
microceph cluster join <token>
microovn cluster join <token>
incus admin init # choose to join an existing cluster
# 3. Setup bridges/symlinks
# ... (see Part 5)
# 4. Add Ceph OSDs
microceph disk add /dev/sdX --wipe
Removing a Node
# 1. Evacuate instances
incus cluster evacuate node-to-remove
# 2. Mark Ceph OSDs out
sudo ceph osd out osd.X osd.Y # all OSDs on that node
# 3. Wait for rebalancing
sudo ceph -w
# Wait until HEALTH_OK
# 4. Remove from clusters
incus cluster remove node-to-remove
microovn cluster remove node-to-remove
microceph cluster remove node-to-remove
Backup Procedures
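The backup scripts below only create archives; pair them with retention. A sketch - the /backup path, the layout, and the 14-day retention are illustrative assumptions, and deletion is left commented out until the listing is verified:

```shell
#!/bin/sh
# List (and, once verified, delete) backups older than RETENTION_DAYS.
# BACKUP_ROOT and RETENTION_DAYS are illustrative defaults.
BACKUP_ROOT=${BACKUP_ROOT:-/backup}
RETENTION_DAYS=${RETENTION_DAYS:-14}
if [ -d "$BACKUP_ROOT" ]; then
  # Append: -exec rm -rf {} +   once the listing looks right
  find "$BACKUP_ROOT" -mindepth 1 -mtime +"$RETENTION_DAYS" -print
fi
```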
Cluster Configuration Backup:
#!/bin/bash
# /usr/local/bin/backup-cluster-config.sh
BACKUP_DIR="/backup/cluster-config/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"
# Incus config
incus config show > "$BACKUP_DIR/incus-config.yaml"
incus profile list -f yaml > "$BACKUP_DIR/profiles.yaml"
incus network list -f yaml > "$BACKUP_DIR/networks.yaml"
incus storage list -f yaml > "$BACKUP_DIR/storage-pools.yaml"
incus cluster list -f yaml > "$BACKUP_DIR/cluster.yaml"
# Export all instance configs
for instance in $(incus list -c n --format=csv); do
incus config show $instance > "$BACKUP_DIR/instance-$instance.yaml"
done
# Ceph config
sudo ceph config dump > "$BACKUP_DIR/ceph-config.txt"
sudo ceph osd dump > "$BACKUP_DIR/ceph-osd-dump.txt"
# OVN config
sudo ovn-nbctl show > "$BACKUP_DIR/ovn-nb.txt"
sudo ovn-sbctl show > "$BACKUP_DIR/ovn-sb.txt"
echo "Backup completed: $BACKUP_DIR"
Instance Backup to Remote:
#!/bin/bash
# Back up all instances to remote storage
BACKUP_SERVER="backup.example.com"
BACKUP_PATH="/backups/incus"
for instance in $(incus list status=running -c n --format=csv); do
echo "Backing up $instance..."
# Create a snapshot
incus snapshot create $instance backup-snap
# Export and transfer
incus export $instance --optimized-storage - |
ssh $BACKUP_SERVER "cat > $BACKUP_PATH/$instance-$(date +%Y%m%d).tar.gz"
# Clean up the snapshot
incus snapshot delete $instance backup-snap
echo "$instance done"
done
Monitoring Setup
Prometheus Metrics:
# Enable metrics in Incus
incus config set core.metrics_address=:9100
# Enable metrics on every node
# (handled automatically in a cluster)
Prometheus scrape config:
scrape_configs:
  - job_name: 'incus'
    static_configs:
      - targets:
          - '10.0.0.253:9100'
          - '10.0.0.252:9100'
          - '10.0.0.250:9100'
  - job_name: 'ceph'
    static_configs:
      - targets:
          - '10.0.0.253:9283' # ceph-exporter
Alert Rules:
groups:
  - name: incus
    rules:
      - alert: IncusNodeDown
        expr: up{job="incus"} == 0
        for: 2m
        labels:
          severity: critical
      - alert: CephHealthWarning
        expr: ceph_health_status == 1
        for: 5m
        labels:
          severity: warning
      - alert: CephHealthCritical
        expr: ceph_health_status == 2
        for: 1m
        labels:
          severity: critical
      - alert: StoragePoolNearlyFull
        expr: (ceph_pool_bytes_used / ceph_pool_max_avail) > 0.8
        for: 10m
        labels:
          severity: warning
Health Check Script
#!/bin/bash
# /usr/local/bin/cluster-healthcheck.sh
ALERT_EMAIL="[email protected]"
check_incus() {
echo "=== Incus Cluster ==="
incus cluster list
echo
OFFLINE=$(incus cluster list -f csv | grep -v "ONLINE" | wc -l)
if [ $OFFLINE -gt 0 ]; then
echo "ALERT: $OFFLINE nodes offline!"
return 1
fi
return 0
}
check_ceph() {
echo "=== Ceph Status ==="
HEALTH=$(sudo ceph health)
echo "$HEALTH"
echo
if [[ "$HEALTH" != "HEALTH_OK" ]]; then
echo "ALERT: Ceph not healthy!"
sudo ceph health detail
return 1
fi
return 0
}
check_ovn() {
echo "=== OVN Status ==="
microovn status
echo
CHASSIS=$(sudo ovn-sbctl list chassis | grep -c hostname)
EXPECTED=3 # adjust based on cluster size
if [ $CHASSIS -lt $EXPECTED ]; then
echo "ALERT: Only $CHASSIS of $EXPECTED OVN chassis online!"
return 1
fi
return 0
}
check_instances() {
echo "=== Instance Status ==="
TOTAL=$(incus list -c n --format=csv | wc -l)
RUNNING=$(incus list status=running -c n --format=csv | wc -l)
echo "Running: $RUNNING / $TOTAL"
echo
# Check for stopped instances that have autostart enabled
for instance in $(incus list status=stopped -c n --format=csv); do
AUTOSTART=$(incus config get $instance boot.autostart)
if [ "$AUTOSTART" = "true" ]; then
echo "ALERT: $instance has autostart but is stopped!"
incus start $instance
fi
done
}
# Run all checks
FAILED=0
check_incus || FAILED=1
check_ceph || FAILED=1
check_ovn || FAILED=1
check_instances
echo
echo "=== Health Check Complete ==="
date
if [ $FAILED -eq 1 ]; then
# Send alert
echo "Cluster health check failed" | mail -s "ALERT: Cluster Issue" $ALERT_EMAIL
exit 1
fi
exit 0
Schedule it:
# Crontab
*/5 * * * * /usr/local/bin/cluster-healthcheck.sh >> /var/log/cluster-health.log 2>&1
Part 9: Production Checklist
Before Going Live
- Hardware: all nodes tested, RAID for the OS disk
- Network: redundant network paths, MTU optimized
- Ceph: minimum 3 nodes, replication factor 3
- OVN: gateway HA configured
- Backup: automated backups tested and verified
- Monitoring: Prometheus + Grafana dashboards
- Alerting: PagerDuty/Slack integration
- Documentation: runbooks for common issues
- DR: disaster recovery procedure documented
Security Checklist
- Firewall rules between nodes (only required ports)
- Certificate-based auth for the Incus API
- No password auth for SSH
- Regular security updates scheduled
- Audit logging enabled
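For the firewall item above, the inter-node TCP ports the stack uses are, as standard defaults to verify against your own install: 8443 (Incus API), 3300/6789 (Ceph MON), 6800-7300 (Ceph OSD/MGR), 6641/6642 (OVN NB/SB databases), plus Geneve tunnels on 6081/udp. A sketch that checks TCP reachability between nodes before locking rules down:

```shell
#!/bin/bash
# Check inter-node TCP reachability for the ports listed above.
# Geneve (6081/udp) is not covered; /dev/tcp only speaks TCP.
check_port() {
  local host=$1 port=$2
  if timeout 2 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "OK   $host:$port"
  else
    echo "FAIL $host:$port"
  fi
}
for node in node-01 node-02 node-03; do
  for port in 8443 3300 6789 6641 6642; do
    check_port "$node" "$port"
  done
done
```

Run it from each node; every FAIL is either a firewall rule to open or a service that is not listening.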
Capacity Planning
# Current capacity
incus cluster list --format=csv | while read -r line; do
NODE=$(echo "$line" | cut -d, -f1)
echo "=== $NODE ==="
incus info --resources --target "$NODE" | grep -A5 "CPU:" | head -6
incus info --resources --target "$NODE" | grep -A3 "Memory:"
done
# Ceph capacity
sudo ceph df
# Projected growth
# - Track the instance creation rate
# - Monitor the storage growth rate
# - Plan for a 20% buffer
Quick Reference Card
Incus Commands
# Cluster
incus cluster list
incus cluster add <node>
incus cluster evacuate <node>
incus cluster restore <node>
incus cluster remove <node> --force
# Instances
incus launch images:ubuntu/24.04 <name> --vm --target <node>
incus move <instance> --target <node>
incus delete <instance> --force
# Storage
incus storage list
incus storage info ceph
# Network
incus network list
incus network info ovn1
Ceph Commands
sudo ceph status
sudo ceph health detail
sudo ceph osd tree
sudo ceph osd df
sudo ceph df
sudo ceph -w
OVN Commands
sudo ovn-nbctl show
sudo ovn-sbctl show
sudo ovn-sbctl list chassis
sudo ovn-sbctl list port_binding
sudo ovs-vsctl show
MicroOVN/MicroCeph
microovn status
microovn cluster list
microceph status
microceph disk list
Closing
With this setup, you have production-grade infrastructure for VPS hosting:
- High Availability - 3+ nodes with automatic failover
- Distributed Storage - Ceph with replication
- Software-Defined Networking - OVN for isolation and scalability
- Live Migration - move VMs without downtime
- Monitoring - full observability with Prometheus
Key takeaways:
- Always run at least 3 nodes for Ceph and Dqlite quorum
- Test failover before going to production
- Automate backups and test restores regularly
- Monitor proactively - do not wait for customers to complain
Ready to serve customers? If you would rather focus on the business without managing infrastructure, consider becoming a Dalang.io reseller - we handle the infrastructure, you handle the customers!
Technical questions? Contact [email protected]
