Quick health check command
echo "=== UNBIND SYSTEM HEALTH CHECK ===" && \
echo "Pod Status:" && kubectl get pods -n unbind-system -l 'app in (unbind-ui,unbind-api,unbind-auth)' && \
echo -e "\nNode Status:" && kubectl get nodes && \
echo -e "\nRecent Events:" && kubectl get events -n unbind-system --sort-by='.lastTimestamp' | tail -10
3) Unbind application logs
kubectl logs -n unbind-system --tail 100 -l 'app in (unbind-ui,unbind-api,unbind-auth,unbind-operator,dex)'
Copy and paste this into your terminal to check the health of your servers and Unbind-related workloads.
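The pod listing from the health check can be long, so it may help to filter it down to pods that need attention. The following is a minimal sketch, not part of Unbind itself: `unhealthy_pods` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Sketch: list pods that need attention.
# Input (stdin): output of `kubectl get pods -n <namespace> --no-headers`,
# assumed to have the columns NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk '
    $3 == "Completed" { next }                 # finished jobs are fine
    $3 != "Running"   { print $1, $3; next }   # Pending, CrashLoopBackOff, ...
    {
      split($2, ready, "/")                    # READY looks like "1/2"
      if (ready[1] != ready[2]) print $1, "NotReady(" $2 ")"
    }
  '
}

# Example (assumes kubectl is configured for your cluster):
#   sudo kubectl get pods -n unbind-system --no-headers | unhealthy_pods
```

An empty result means every listed pod is either Running with all containers ready, or Completed.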
Troubleshooting
This guide covers common issues you might encounter with Unbind/Kubernetes and how to resolve them.
General Diagnostic Commands
Before diving into specific issues, these commands help gather information about your system:
System Status
# Check overall cluster health
sudo kubectl get nodes
sudo kubectl get pods --all-namespaces
# Get logs from a pod
sudo kubectl logs <pod-name> -n <namespace>
# Check Unbind specific services
sudo kubectl get pods -n unbind-system
sudo kubectl get services -n unbind-system
Resource Usage
# Check resource consumption
sudo kubectl top nodes # By server
sudo kubectl top pods --all-namespaces # By workload
# Check storage usage
df -h
Logs
# Unbind application logs
sudo kubectl logs deployment/unbind-api-deployment -n unbind-system --follow
sudo kubectl logs deployment/unbind-ui-deployment -n unbind-system --follow
# K3S system logs
sudo journalctl -u k3s -f
# System logs
sudo dmesg | tail -20
Common Issues
1. Cluster Access Issues
Symptoms:
- Can’t connect to the cluster with kubectl
- “connection refused” errors
- Certificate or authentication errors
Connection Refused
If you get “connection refused” errors:
# Check if K3S is running
sudo systemctl status k3s
# Check if the API server is listening
sudo netstat -tlnp | grep 6443
# Check firewall rules
sudo ufw status
sudo iptables -L -n | grep 6443 # -n keeps ports numeric so the grep can match
Certificate Issues
If you encounter certificate errors:
# Check certificate validity
openssl s_client -connect YOUR_SERVER_IP:6443 </dev/null 2>/dev/null | openssl x509 -noout -dates
# Skip certificate verification (testing only)
kubectl --insecure-skip-tls-verify get nodes
Permission Denied
If you are having issues accessing the cluster remotely, the credentials may have been rotated. You can refresh the kubeconfig by following the cluster access guide.
If you get permission errors:
# Check kubeconfig file permissions
ls -la ~/.kube/config
# Check if you're using the right kubeconfig
kubectl config view
# Check server connectivity
kubectl cluster-info
Authentication Issues
# Check if credentials are valid
kubectl auth can-i get pods
# View current user info
kubectl config view --minify
# List certificate signing requests (pending or denied requests can explain certificate errors)
kubectl get csr
2. Installation Failed
If the installation fails for any reason, you can attempt to run the installer again.
Diagnostic Steps:
# Check system requirements
cat /proc/cpuinfo | grep processor | wc -l # CPU count
free -h # Memory
df -h # Disk space
# Check network connectivity
curl -I https://unbind.app
ping -c 4 8.8.8.8 # Send 4 packets, then stop
Solutions:
- Verify system requirements are met
- Check internet connectivity
- Ensure ports 80, 443, 6443 are available
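One way to check whether ports 80, 443, and 6443 are already taken is to filter the list of listening sockets. The sketch below is an illustration under assumptions: `ports_in_use` is a hypothetical helper name, and it expects the usual `ss -tln` column layout, where the fourth column is the local address and port.

```shell
#!/usr/bin/env bash
# Sketch: report which of the given ports already have a listener.
# Input (stdin): output of `ss -tln`; arguments: the ports to check.
ports_in_use() {
  awk -v want="$*" '
    BEGIN { n = split(want, w, " "); for (i = 1; i <= n; i++) wanted[w[i]] = 1 }
    $1 == "LISTEN" {
      port = $4
      sub(/.*:/, "", port)      # keep only the text after the last colon
      if (port in wanted) used[port] = 1
    }
    END { for (i = 1; i <= n; i++) if (w[i] in used) print w[i] }
  '
}

# Example:
#   sudo ss -tln | ports_in_use 80 443 6443
```

No output means none of the listed ports has a listener. After a partial installation, K3S itself may legitimately be the process holding 6443.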
3. Can’t Access Unbind UI
Symptoms:
- Browser shows connection refused or timeout
- DNS resolution issues
Diagnostic Steps:
# Check if services are running
sudo kubectl get pods -l app=unbind-ui -n unbind-system
sudo kubectl get services -l app=unbind-ui -n unbind-system
Solutions:
- Verify DNS records point to your server
- Check firewall settings (allow ports 80, 443)
- Check SSL certificate status:
sudo kubectl get certificaterequest --all-namespaces
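To verify that DNS records point to your server, you can compare what the domain resolves to with the server's public IP. This is a minimal sketch: `dns_points_to` is a hypothetical helper, and the domain and IP in the example are placeholders, not values from your installation.

```shell
#!/usr/bin/env bash
# Sketch: check whether a list of resolved addresses contains the expected IP.
# Input (stdin): one IP address per line, e.g. the output of `dig +short`.
# Argument: the IP address your server is expected to have.
dns_points_to() {
  local expected_ip="$1"
  if grep -Fxq -- "$expected_ip"; then
    echo "OK: DNS resolves to $expected_ip"
  else
    echo "MISMATCH: DNS does not resolve to $expected_ip"
    return 1
  fi
}

# Example with placeholder values:
#   dig +short unbind.example.com | dns_points_to 203.0.113.10
```

A mismatch right after changing a record can simply mean the change has not propagated yet.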
4. Applications Won’t Start
Symptoms:
- Pods stuck in Pending, CrashLoopBackOff, or Error states
- Applications show as unhealthy
Diagnostic Steps:
# Check pod status and events
sudo kubectl get pods --all-namespaces
sudo kubectl describe pod <pod-name> -n <namespace>
# Check resource constraints
sudo kubectl describe nodes
sudo kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp
Solutions:
- Insufficient Resources: Add more nodes, resize the server, or set tighter resource limits for specific workloads.
- Image Pull Issues: Check internet connectivity and registry credentials
- Storage Issues: Verify persistent volume claims:
kubectl get pvc
and Longhorn pod status:
kubectl get pods -n longhorn-system
- Configuration Errors: Check application logs for errors.
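The solutions above can be read as a rough lookup from pod status to a first thing to check. The sketch below encodes that lookup. `suggest_fix` is a hypothetical helper, and the mapping is a starting point based on common Kubernetes behavior rather than a definitive diagnosis.

```shell
#!/usr/bin/env bash
# Sketch: map a pod STATUS value to the first thing worth checking.
# Argument: a status as shown by `kubectl get pods` (e.g. Pending).
suggest_fix() {
  case "$1" in
    Pending)
      echo "Check for insufficient resources or unbound volume claims" ;;
    ImagePullBackOff|ErrImagePull)
      echo "Check internet connectivity and registry credentials" ;;
    CrashLoopBackOff|Error)
      echo "Check application logs for configuration errors" ;;
    *)
      echo "Inspect the pod with kubectl describe" ;;
  esac
}

# Example:
#   suggest_fix CrashLoopBackOff
```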
5. Storage Issues
Symptoms:
- Pods can’t mount volumes
- “No space left on device” errors
- Longhorn volumes in degraded state
Diagnostic Steps:
# Check disk space
df -h
sudo kubectl get pv,pvc --all-namespaces
# Check Longhorn status
sudo kubectl get pods -n longhorn-system
Solutions:
- Disk Full: Clean up unused containers and volumes. To delete a volume claim you no longer need (depending on the reclaim policy, this can permanently delete its data):
sudo kubectl delete pvc <PVC_NAME> -n <namespace>
- Longhorn Issues: Restart Longhorn:
sudo kubectl rollout restart daemonset/longhorn-manager -n longhorn-system
- Permission Issues: Check node storage permissions
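To spot a nearly full disk before it produces "No space left on device" errors, the `df` output can be filtered by usage. This is a sketch under assumptions: `full_filesystems` is a hypothetical helper, the 90% default threshold is arbitrary, and it expects the POSIX layout of `df -P` with mount points that contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch: print mount points whose usage is at or above a threshold.
# Input (stdin): output of `df -P` (columns: Filesystem, 1024-blocks, Used,
# Available, Capacity, Mounted on). Argument: threshold in percent.
full_filesystems() {
  local threshold="${1:-90}"
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)                 # "91%" -> "91"
      if (use + 0 >= t + 0) print $6, $5
    }
  '
}

# Example:
#   df -P | full_filesystems 85
```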
6. Networking Problems
Symptoms:
- Services can’t communicate
- External traffic not reaching applications
- DNS resolution failures
Diagnostic Steps:
# Check network connectivity between pods (ClusterIP services may not answer ping, so a failure here is not conclusive on its own)
sudo kubectl exec -it <pod-name> -n <namespace> -- ping -c 4 <service-name>
# Check DNS resolution
sudo kubectl exec -it <pod-name> -n <namespace> -- nslookup kubernetes.default
# Check service endpoints
sudo kubectl get endpoints --all-namespaces
Solutions:
- CNI Issues: Restart flannel:
sudo kubectl rollout restart daemonset/flannel -n kube-system
- DNS Issues: Restart CoreDNS:
sudo kubectl rollout restart deployment/coredns -n kube-system
- Firewall: Check iptables rules and security groups
- Reboot the server: If networking issues persist, rebooting the server often helps, because it flushes iptables rules and resets the network stack.
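When services cannot communicate, a common cause is a service with no backing pods. The endpoints listing from the diagnostic steps can be filtered for that case. This is a minimal sketch: `services_without_endpoints` is a hypothetical helper, and it assumes the column layout of `kubectl get endpoints --all-namespaces --no-headers` (NAMESPACE NAME ENDPOINTS AGE), where an empty service shows `<none>`.

```shell
#!/usr/bin/env bash
# Sketch: list services that currently have no endpoints.
# Input (stdin): output of
#   kubectl get endpoints --all-namespaces --no-headers
# assumed to have the columns NAMESPACE NAME ENDPOINTS AGE.
services_without_endpoints() {
  awk '$3 == "<none>" { print $1 "/" $2 }'
}

# Example:
#   sudo kubectl get endpoints --all-namespaces --no-headers | services_without_endpoints
```

A service listed here usually means its selector matches no ready pods, which points back to the pod-level checks rather than the network.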
7. Performance Issues
Symptoms:
- Slow application response times
- High resource usage
- Frequent pod restarts
Diagnostic Steps:
# Monitor resource usage
sudo kubectl top nodes
sudo kubectl top pods --all-namespaces
# Check for resource limits
sudo kubectl describe pods --all-namespaces | grep -A 5 -B 5 "Limits\|Requests"
# Monitor system load
htop
iostat 1 5
Solutions:
- CPU/Memory: Increase resource limits or add nodes
- I/O Bottleneck: Use faster storage or optimize applications
- Network: Check for high network utilization
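Frequent pod restarts are easier to see when the pod listing is filtered by restart count. The sketch below is illustrative: `restart_heavy_pods` is a hypothetical helper, the default threshold of 5 is arbitrary, and it assumes the column layout of `kubectl get pods --all-namespaces --no-headers` (NAMESPACE NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Sketch: print pods whose restart count is at or above a threshold.
# Input (stdin): output of `kubectl get pods --all-namespaces --no-headers`,
# assumed to have the columns NAMESPACE NAME READY STATUS RESTARTS AGE.
# Argument: minimum restart count to report (default 5).
restart_heavy_pods() {
  local min_restarts="${1:-5}"
  awk -v m="$min_restarts" '$5 + 0 >= m + 0 { print $1 "/" $2, $5 }'
}

# Example:
#   sudo kubectl get pods --all-namespaces --no-headers | restart_heavy_pods 3
```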
Advanced Troubleshooting
Cluster Recovery
If your cluster is in a bad state or a node is not responding:
# Restart K3S service (on agent nodes the service is named k3s-agent)
sudo systemctl restart k3s
Getting Help
Collecting Debug Information
When seeking help, collect this information:
# System information
uname -a
sudo k3s --version
# Cluster state
sudo kubectl cluster-info dump > cluster-dump.json # The dump is JSON by default
# Application logs
sudo kubectl logs deployment/unbind-api-deployment -n unbind-system > unbind-api.log
sudo kubectl logs deployment/unbind-ui-deployment -n unbind-system > unbind-ui.log
Support Channels
- Discord Community: Join our Discord for real-time help
- GitHub Issues: Report bugs at GitHub
- Documentation: Check the Support page for more resources
When reporting issues, always include your system information, error messages, and relevant logs.
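If you collect this information often, the commands can be wrapped so that everything lands in one directory. This is a sketch, not an official Unbind tool: `collect_debug_info` and the default directory name `unbind-debug` are hypothetical, and the function is meant to be run as root (or with a kubeconfig that can read the cluster). It skips tools that are not installed.

```shell
#!/usr/bin/env bash
# Sketch: gather system and cluster information into one directory.
# Argument: output directory (default: unbind-debug, a made-up name).
collect_debug_info() {
  local out_dir="${1:-unbind-debug}"
  mkdir -p "$out_dir" || return 1

  # System information
  uname -a > "$out_dir/uname.txt"
  if command -v k3s >/dev/null 2>&1; then
    k3s --version > "$out_dir/k3s-version.txt" 2>&1
  fi

  # Cluster state and application logs (skipped when kubectl is missing)
  if command -v kubectl >/dev/null 2>&1; then
    kubectl cluster-info dump > "$out_dir/cluster-dump.json" 2>&1
    kubectl logs deployment/unbind-api-deployment -n unbind-system > "$out_dir/unbind-api.log" 2>&1
    kubectl logs deployment/unbind-ui-deployment -n unbind-system > "$out_dir/unbind-ui.log" 2>&1
  fi

  echo "$out_dir"
}

# Example:
#   collect_debug_info /tmp/unbind-debug
```

Review the collected files before sharing them, since cluster dumps and logs can contain hostnames and other environment details.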
Prevention
Regular Maintenance
# Update system packages
sudo apt update && sudo apt upgrade # Ubuntu/Debian
sudo dnf update # Fedora/RHEL
sudo zypper update # OpenSUSE
# Clean up unused resources
sudo kubectl delete pods --all-namespaces --field-selector=status.phase=Failed
sudo crictl rmi --prune # Remove unused images (similar to docker image prune -a)
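If the same maintenance notes are used across servers with different distributions, the right update command can be picked by checking which package manager is installed. This is a small sketch: `update_command` is a hypothetical helper, and it only prints the command so you can review it before running it.

```shell
#!/usr/bin/env bash
# Sketch: print the package update command for this system.
# It only prints the command; nothing is installed or upgraded.
update_command() {
  if command -v apt >/dev/null 2>&1; then
    echo "sudo apt update && sudo apt upgrade"   # Ubuntu/Debian
  elif command -v dnf >/dev/null 2>&1; then
    echo "sudo dnf update"                       # Fedora/RHEL
  elif command -v zypper >/dev/null 2>&1; then
    echo "sudo zypper update"                    # OpenSUSE
  else
    echo "unknown package manager" >&2
    return 1
  fi
}

# Example:
#   update_command
```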