Upgrade Your Node
Cosmovisor
The Cosmos SDK provides a convenient process manager that wraps around the crossfid binary and can automatically swap in new binaries upon a successful governance upgrade proposal. Cosmovisor is highly recommended for validator operations as it enables:
- Automatic upgrades: Seamlessly switches to new binaries during governance upgrades
- Zero-downtime upgrades: Minimizes validator downtime during network upgrades
- Backup management: Automatically backs up the node's data before an upgrade (configurable via UNSAFE_SKIP_BACKUP)
- Rollback capability: Can revert to previous versions if needed
- Integration with systemd: Works perfectly with system service managers for automatic restarts
For validator operations, Cosmovisor combined with systemd helps keep MTTR (Mean Time To Recovery) under 2 minutes, which is critical for maintaining high validator uptime and avoiding slashing penalties.
More information can be found in cosmos.network docs and cosmos-sdk/cosmovisor/readme.
Setup
Install Cosmovisor
- Install the latest version of Cosmovisor:
# Install from source
go install cosmossdk.io/tools/cosmovisor/cmd/cosmovisor@latest
# Verify installation
cosmovisor version
- Note: Ensure you have Go 1.19+ installed for the latest version.
Set Environment Variables
- Configure Cosmovisor environment variables:
# Add to your shell profile (~/.bashrc, ~/.zshrc, or ~/.profile)
echo "# Setup Cosmovisor" >> ~/.profile
echo "export DAEMON_NAME=crossfid" >> ~/.profile
echo "export DAEMON_HOME=$HOME/.crossfi" >> ~/.profile
echo "export DAEMON_RESTART_AFTER_UPGRADE=true" >> ~/.profile
echo "export DAEMON_ALLOW_DOWNLOAD_BINARIES=false" >> ~/.profile
echo "export UNSAFE_SKIP_BACKUP=false" >> ~/.profile
# Apply changes
source ~/.profile
- Environment Variables Explained:
DAEMON_RESTART_AFTER_UPGRADE=true: Automatically restart after upgrades
DAEMON_ALLOW_DOWNLOAD_BINARIES=false: Disable automatic binary downloads (recommended for security)
UNSAFE_SKIP_BACKUP=false: Back up the node's data before each upgrade
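Before moving on, it can help to confirm that the variables are actually visible in the current shell. The following is a minimal sketch; `check_cosmovisor_env` is a hypothetical helper written for this guide, not part of Cosmovisor, and it only checks the five variables set above.

```shell
# Hypothetical helper (not part of Cosmovisor): report which of the
# variables used in this guide are unset or empty in the current shell.
check_cosmovisor_env() {
  missing=0
  for name in DAEMON_NAME DAEMON_HOME DAEMON_RESTART_AFTER_UPGRADE \
              DAEMON_ALLOW_DOWNLOAD_BINARIES UNSAFE_SKIP_BACKUP; do
    # Look up the value of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    else
      echo "ok: $name=$value"
    fi
  done
  # Returns 0 when everything is set, 1 otherwise
  return "$missing"
}

# Example usage:
# check_cosmovisor_env || echo "Set the missing variables and re-source ~/.profile"
```

If any variable is reported as missing, re-check the lines appended to your profile and run source again.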
Create Directory Structure
- Set up the required Cosmovisor directory structure:
# Create directories
mkdir -p $DAEMON_HOME/cosmovisor/genesis/bin
mkdir -p $DAEMON_HOME/cosmovisor/upgrades
# Copy current binary
cp $(which crossfid) $DAEMON_HOME/cosmovisor/genesis/bin/
# Verify setup
cosmovisor version
ls -la $DAEMON_HOME/cosmovisor/
- The directory structure should look like:
~/.crossfi/cosmovisor/
├── current -> genesis or upgrades/<name>
├── genesis
│   └── bin
│       └── crossfid
└── upgrades
    └── <upgrade_name>
        └── bin
            └── crossfid
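The layout above can also be checked with a small script instead of by eye. This is a sketch only: `verify_cosmovisor_layout` is a hypothetical helper, it assumes DAEMON_HOME and DAEMON_NAME are already exported, and it deliberately does not check the current link, since that link may not exist before Cosmovisor has run for the first time.

```shell
# Hypothetical helper: check the parts of the Cosmovisor layout that
# this guide creates by hand. Assumes DAEMON_HOME and DAEMON_NAME are set.
verify_cosmovisor_layout() {
  root="$DAEMON_HOME/cosmovisor"
  genesis_bin="$root/genesis/bin/$DAEMON_NAME"
  status=0

  # The genesis binary must exist and be executable
  if [ -x "$genesis_bin" ]; then
    echo "ok: $genesis_bin"
  else
    echo "missing or not executable: $genesis_bin"
    status=1
  fi

  # The upgrades directory is where future binaries are placed
  if [ -d "$root/upgrades" ]; then
    echo "ok: $root/upgrades"
  else
    echo "missing directory: $root/upgrades"
    status=1
  fi

  return "$status"
}

# Example usage:
# verify_cosmovisor_layout
```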
Test Cosmovisor
- Test that Cosmovisor can run your node:
# Test cosmovisor (should show same version as crossfid)
cosmovisor version
# Start node with cosmovisor (for testing)
cosmovisor run start --help
- ⚠️ Important: Don't run cosmovisor run start yet if you plan to use systemd - we'll configure that next.
Preparing an Upgrade
Cosmovisor continually polls the $DAEMON_HOME/data/upgrade-info.json file for new upgrade instructions. When an upgrade is ready, node operators can download the new binary and place it under $DAEMON_HOME/cosmovisor/upgrades/<name>/bin, where <name> is the URI-encoded name of the upgrade as specified in the upgrade module plan.
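Placing the binary by hand can be sketched as below. `stage_upgrade` is a hypothetical helper, not a Cosmovisor command; it assumes DAEMON_HOME and DAEMON_NAME are exported as configured earlier, and that the new binary has already been built or downloaded and verified. The upgrade name and source path in the usage comment are placeholders.

```shell
# Hypothetical helper: copy a new binary into the directory Cosmovisor
# checks for the named upgrade. Assumes DAEMON_HOME and DAEMON_NAME are set.
stage_upgrade() {
  upgrade_name=$1   # upgrade name from the plan (URI-encoded if needed)
  new_binary=$2     # path to the already verified new binary
  target_dir="$DAEMON_HOME/cosmovisor/upgrades/$upgrade_name/bin"

  if [ ! -f "$new_binary" ]; then
    echo "binary not found: $new_binary" >&2
    return 1
  fi

  mkdir -p "$target_dir" || return 1
  # The staged file is named after DAEMON_NAME, matching the genesis layout
  cp "$new_binary" "$target_dir/$DAEMON_NAME" || return 1
  chmod +x "$target_dir/$DAEMON_NAME" || return 1
  echo "staged $target_dir/$DAEMON_NAME"
}

# Example usage (placeholder name and path):
# stage_upgrade "<upgrade_name>" /path/to/new/crossfid
```

Staging the binary ahead of the upgrade height means the node does not have to wait for an operator when the upgrade triggers.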
It is possible to have Cosmovisor automatically download the new binary. To do this set the following environment variable.
export DAEMON_ALLOW_DOWNLOAD_BINARIES=true
Cosmovisor with systemd
For production validator operations, it's essential to run Cosmovisor as a system service to ensure automatic restarts and maintain high availability.
Why systemd + Cosmovisor?
🔄 Automatic Recovery
- Restarts node immediately after crashes
- Maintains MTTR < 2 minutes
- Reduces slashing risk from downtime
- Works during system reboots
⚙️ Production Benefits
- Professional service management
- Integrated logging with journald
- Resource limit enforcement
- Dependency management
systemd Service Setup
Create Service File
- Create a systemd service file for your CrossFi validator:
sudo nano /etc/systemd/system/crossfid.service
- Add the following configuration (replace <username> with your actual username):
[Unit]
Description=CrossFi Daemon (Cosmovisor)
After=network-online.target
Wants=network-online.target
[Service]
Type=exec
User=<username>
Group=<username>
ExecStart=/home/<username>/go/bin/cosmovisor run start
Restart=always
RestartSec=3
LimitNOFILE=4096
Environment="DAEMON_NAME=crossfid"
Environment="DAEMON_HOME=/home/<username>/.crossfi"
Environment="DAEMON_ALLOW_DOWNLOAD_BINARIES=false"
Environment="DAEMON_RESTART_AFTER_UPGRADE=true"
Environment="DAEMON_LOG_BUFFER_SIZE=512"
Environment="UNSAFE_SKIP_BACKUP=false"
[Install]
WantedBy=multi-user.target
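Replacing the <username> placeholder by hand is easy to get wrong in one of the four places it appears. As a sketch, the substitution can be scripted; `render_unit` is a hypothetical helper, and the template filename in the usage comment is an assumption (any file containing the configuration above with the placeholder left in place).

```shell
# Hypothetical helper: print a unit file with every <username>
# placeholder replaced. The template itself is left untouched.
render_unit() {
  template=$1   # file containing the configuration with <username> in it
  username=$2   # account that runs the node
  # '|' is used as the sed delimiter because the paths contain '/'
  sed "s|<username>|$username|g" "$template"
}

# Example usage (template filename is an assumption):
# render_unit crossfid.service.template "$USER" | sudo tee /etc/systemd/system/crossfid.service
# systemd-analyze verify /etc/systemd/system/crossfid.service
```

The optional systemd-analyze verify step reports syntax problems in the unit before the service is started.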
Configure Service
- Key Configuration Options:
Restart=always: Always restart the service if it stops
RestartSec=3: Wait 3 seconds before restarting
LimitNOFILE=4096: Set file descriptor limit
After=network-online.target: Wait for network connectivity
- Environment Variables:
DAEMON_ALLOW_DOWNLOAD_BINARIES=false: Security best practice
DAEMON_RESTART_AFTER_UPGRADE=true: Auto-restart after upgrades
DAEMON_LOG_BUFFER_SIZE=512: Optimize logging performance
Enable and Start Service
- Enable and start the CrossFi service:
# Reload systemd configuration
sudo systemctl daemon-reload
# Enable service to start on boot
sudo systemctl enable crossfid.service
# Start the service
sudo systemctl start crossfid.service
# Check service status
sudo systemctl status crossfid.service
Monitor and Manage
- Essential monitoring commands:
# View live logs
sudo journalctl -u crossfid.service -f
# View recent logs
sudo journalctl -u crossfid.service --since "1 hour ago"
# Check service status
sudo systemctl status crossfid.service
# Restart service (if needed)
sudo systemctl restart crossfid.service
# Stop service
sudo systemctl stop crossfid.service
Testing Automatic Restart
Test Process Recovery
- Kill the crossfid process to test automatic restart:
# Find the crossfid process ID
ps aux | grep crossfid
# Kill the process (replace PID with actual process ID)
sudo kill -9 <PID>
# Check that systemd restarts it within seconds
sudo systemctl status crossfid.service
- Expected behavior: The service should restart about 3 seconds after the process exits (RestartSec=3).
Monitor Recovery Time
- Monitor logs to confirm quick recovery:
# Watch logs in real-time during the test
sudo journalctl -u crossfid.service -f
- You should see:
- Process termination logged
- Automatic restart initiated
- Node resuming operations
- Total downtime < 10 seconds
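To put a rough number on the downtime rather than reading it off the logs, the wait can be timed. This is a sketch under stated assumptions: `measure_recovery` is a hypothetical helper, the default check assumes the crossfid.service unit name used in this guide, and "active" only means systemd has started the process again, not that the node has caught up with the chain.

```shell
# Hypothetical helper: poll a health check once per second and report
# how long it took to succeed. Timing starts when the function is called.
measure_recovery() {
  timeout_s=${1:-60}
  # Default check assumes the unit name used in this guide
  check=${2:-"systemctl is-active --quiet crossfid.service"}
  start=$(date +%s)

  while ! sh -c "$check" >/dev/null 2>&1; do
    now=$(date +%s)
    if [ $((now - start)) -ge "$timeout_s" ]; then
      echo "not recovered after ${timeout_s}s"
      return 1
    fi
    sleep 1
  done

  end=$(date +%s)
  echo "recovered after $((end - start))s"
}

# Example usage: run immediately after killing the process
# measure_recovery 30
```

Because it polls once per second, the reported figure is only accurate to about a second.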
Test Reboot Persistence
- Verify the service starts automatically after system reboot:
# Check if service is enabled
sudo systemctl is-enabled crossfid.service
# Should return: enabled
- After a system reboot, verify the service starts automatically:
sudo systemctl status crossfid.service
Advanced Configuration
Resource Limits
For high-performance validators, consider additional resource limits:
[Service]
# Memory limits (adjust based on your system)
MemoryMax=8G
MemoryHigh=6G
# CPU limits (optional)
CPUQuota=200%
# I/O limits (optional)
IOWeight=500
# Process limits
LimitNPROC=32768
LimitNOFILE=65536
Logging Configuration
Configure advanced logging options:
[Service]
# Send output to both the journal and the console (optional)
StandardOutput=journal+console
StandardError=journal+console
# Additional environment variables
Environment="DAEMON_LOG_BUFFER_SIZE=1024"
Environment="DAEMON_LOG_LEVEL=info"
Security Hardening
For enhanced security, add these options:
[Service]
# Run with restricted privileges
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
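# read-only keeps the binaries under /home/<username> readable; ProtectHome=true would hide /home from the service
ProtectHome=read-only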
ReadWritePaths=/home/<username>/.crossfi
# Network restrictions
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
Troubleshooting
🚫 Service Won't Start
Common issues and solutions:
- Permission issues: Ensure the user has access to all paths
- Binary not found: Verify cosmovisor and crossfid are in PATH
- Environment variables: Check all required variables are set
# Debug service startup
sudo systemctl status crossfid.service -l
sudo journalctl -u crossfid.service --since "10 minutes ago"
⏱️ Slow Restart Times
If restart times exceed 10 seconds:
- Reduce RestartSec to 1 second (minimum recommended)
- Check system I/O performance
- Ensure sufficient RAM available
- Monitor CPU usage during restart
# Monitor system resources
htop
iostat -x 1
free -h
📊 Log Management
Manage log storage and rotation:
# Limit journal size (add to /etc/systemd/journald.conf)
SystemMaxUse=1G
SystemMaxFileSize=100M
SystemMaxFiles=10
# Apply changes
sudo systemctl restart systemd-journald
# Clean old logs
sudo journalctl --vacuum-time=7d
For mission-critical validators, consider setting up monitoring alerts that notify you if the service restarts frequently, indicating potential underlying issues.
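One simple signal for such an alert is systemd's own restart counter. The sketch below reads the NRestarts property of the unit (available on reasonably recent systemd versions) and compares it with a threshold. `check_restarts` is a hypothetical helper, the threshold is an arbitrary example, and how the alert is delivered (mail, webhook, pager) is left out.

```shell
# Hypothetical helper: warn when the unit has been auto-restarted more
# often than a threshold. The second argument overrides the count and
# exists mainly so the comparison can be tried without systemd.
check_restarts() {
  threshold=$1
  count=${2:-$(systemctl show crossfid.service -p NRestarts --value)}

  # Refuse to compare if the counter could not be read as a number
  case $count in
    ''|*[!0-9]*) echo "could not read restart count"; return 2 ;;
  esac

  if [ "$count" -gt "$threshold" ]; then
    echo "ALERT: crossfid.service restarted $count times (threshold $threshold)"
    return 1
  fi
  echo "ok: $count restarts (threshold $threshold)"
}

# Example usage, e.g. from a cron job or systemd timer:
# check_restarts 3 || echo "investigate with: journalctl -u crossfid.service"
```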
Manual Software Upgrade
First, stop your instance of `