# Cache Troubleshooting Guide

## Problem Description

The LabFusion CI/CD pipelines were experiencing cache timeout errors:

```
::warning::Failed to restore: getCacheEntry failed: connect ETIMEDOUT 172.31.0.3:44029
```

This error occurs when the cache service is not accessible from the job containers due to Docker networking issues.

## Root Cause

The issue is caused by:

1. **Docker Networking**: Containers can't reach the cache server on the host
2. **Random Port Assignment**: Using port 0 causes unpredictable port assignments
3. **Cache Service Location**: The cache service binds to an IP that containers can't access

## Solutions Implemented

### 1. Workflow-Level Fixes

Added `fail-on-cache-miss: false` to all cache actions in:

- `.gitea/workflows/api-gateway.yml`
- `.gitea/workflows/frontend.yml`
- `.gitea/workflows/service-adapters.yml`
- `.gitea/workflows/api-docs.yml`
- `.gitea/workflows/ci.yml`

This ensures that cache failures don't cause the entire pipeline to fail.

### 2. Runner Configuration Fixes

Created `runners/config_cache_fixed.yaml` with:

- **Fixed Host**: `host.docker.internal` (allows containers to access host)
- **Fixed Port**: `44029` (instead of random port 0)
- **Host Network**: Uses host networking for better connectivity

### 3. Troubleshooting Tools

Created diagnostic scripts:

- `runners/fix-cache-issues.sh` (Linux/macOS)
- `runners/fix-cache-issues.ps1` (Windows)

These scripts help diagnose and fix cache issues.

## How to Apply the Fixes

### Option 1: Use the Fixed Configuration

1. Stop your current runner:

   ```bash
   pkill -f act_runner
   ```

2. Start with the fixed configuration:

   ```bash
   ./act_runner daemon --config config_cache_fixed.yaml
   ```

### Option 2: Run the Troubleshooting Script

**Linux/macOS:**

```bash
cd runners
./fix-cache-issues.sh
```

**Windows:**

```powershell
cd runners
.\fix-cache-issues.ps1
```

### Option 3: Manual Configuration

Update your runner configuration with these key changes:

```yaml
cache:
  enabled: true
  host: "host.docker.internal"  # Fixed host
  port: 44029                   # Fixed port

container:
  network: "host"               # Use host networking
```

## Verification

After applying the fixes:

1. **Check Runner Logs**: Look for cache service startup messages
2. **Test a Workflow**: Run a simple workflow to verify cache works
3. **Monitor Cache Hits**: Check if dependencies are being cached properly

## Expected Results

- ✅ No more `ETIMEDOUT` errors
- ✅ Cache hits show "✅ Cache hit!" messages
- ✅ Faster build times due to dependency caching
- ✅ Workflows continue even if cache fails

## Troubleshooting

If issues persist:

1. **Check Docker Networking**:

   ```bash
   docker network ls
   docker network inspect bridge
   ```

2. **Verify Cache Service**:

   ```bash
   netstat -tlnp | grep 44029
   ```

3. **Test Connectivity**:

   ```bash
   curl http://host.docker.internal:44029/
   ```

4. **Check Runner Logs**:

   ```bash
   tail -f runner.log
   ```

## Additional Resources

- [Gitea Act Runner Documentation](https://gitea.com/gitea/act_runner/src/branch/main/docs/configuration.md)
- [GitHub Actions Cache Documentation](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows)
- [Docker Networking Documentation](https://docs.docker.com/network/)

## Support

If you continue to experience cache issues after applying these fixes, please:

1. Run the troubleshooting script and share the output
2. Check the runner logs for any error messages
3. Verify your Docker and network configuration
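
When verifying the network configuration, it can help to probe the exact endpoint named in the warning rather than guessing it. The following is a minimal sketch, not part of the shipped `fix-cache-issues` scripts: the function names are hypothetical, and it assumes bash (for its `/dev/tcp` feature) and the `timeout` command from GNU coreutils are available.

```bash
#!/usr/bin/env bash
# Sketch: pull the cache endpoint out of a runner warning and probe it.
# Function names are illustrative and are not provided by act_runner.

# Print "HOST PORT" for the first IPv4:port pair found in a log line.
parse_cache_endpoint() {
  local line="$1"
  local re='([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+):([0-9]+)'
  if [[ "$line" =~ $re ]]; then
    echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Return 0 if a TCP connection to HOST:PORT opens within TIMEOUT seconds.
check_port() {
  local host="$1" port="$2" timeout_s="${3:-3}"
  timeout "$timeout_s" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report whether the endpoint named in a log line is reachable.
report_cache_endpoint() {
  local endpoint host port
  endpoint="$(parse_cache_endpoint "$1")" || { echo "no endpoint found"; return 2; }
  read -r host port <<<"$endpoint"
  if check_port "$host" "$port"; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port"
    return 1
  fi
}
```

One possible use is `report_cache_endpoint "$(grep -m1 ETIMEDOUT runner.log)"`. Because the failure is about container-to-host networking, the result is most meaningful when the function runs inside a job container rather than on the host itself.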