refactor: Apply cache fixes directly to existing runner configs

- Update all runner configuration files with cache networking fixes:
  - config_docker.yaml
  - config_heavy.yaml
  - config_light.yaml
  - config_security.yaml
- Remove separate config_cache_fixed.yaml file
- Update troubleshooting scripts to use updated configs
- Update documentation to reference existing config files

All runner configs now have:
- Fixed cache host: host.docker.internal
- Fixed cache port: 44029
- Host networking for better container connectivity

This provides a cleaner approach by updating existing configs
instead of maintaining a separate fixed configuration file.

2025-09-15 16:44:16 +02:00


Cache Troubleshooting Guide

Problem Description

The LabFusion CI/CD pipelines were experiencing cache timeout errors:

```
::warning::Failed to restore: getCacheEntry failed: connect ETIMEDOUT 172.31.0.3:44029
```

This error occurs when job containers cannot reach the runner's cache service because of Docker networking issues.
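When triaging, it helps to know which endpoint the job tried to reach. The following is a minimal sketch that pulls the `host:port` pair out of a warning line like the one above; the function name `cache_endpoint_from_log` is hypothetical and not part of the repository's scripts.

```shell
# Extract the "host:port" endpoint from an ETIMEDOUT warning line.
# Prints nothing and returns non-zero if no endpoint is found.
# (Hypothetical helper, not one of the repository's scripts.)
cache_endpoint_from_log() {
  # Keep only the address that follows "ETIMEDOUT ".
  printf '%s\n' "$1" | sed -n 's/.*ETIMEDOUT \([0-9.]*:[0-9]*\).*/\1/p' | grep .
}

line='::warning::Failed to restore: getCacheEntry failed: connect ETIMEDOUT 172.31.0.3:44029'
endpoint=$(cache_endpoint_from_log "$line")
echo "cache endpoint: $endpoint"
echo "host: ${endpoint%:*}  port: ${endpoint##*:}"
```

If the reported host is a container-network address rather than one the job containers can route to, the networking causes described in this guide are the likely culprit.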

Root Cause

The issue has three contributing causes:

- Docker Networking: Containers can't reach the cache server on the host
- Random Port Assignment: Setting the port to 0 makes the cache server listen on an unpredictable port
- Cache Service Location: The cache service binds to an IP that containers can't access

Solutions Implemented

1. Workflow-Level Fixes

Added `fail-on-cache-miss: false` to all cache actions in:

- .gitea/workflows/api-gateway.yml
- .gitea/workflows/frontend.yml
- .gitea/workflows/service-adapters.yml
- .gitea/workflows/api-docs.yml
- .gitea/workflows/ci.yml

With this setting, a cache entry that cannot be restored does not fail the job, so the pipeline continues without the cache.
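To confirm that no workflow was missed, a check along the following lines may help. It is a rough sketch: it assumes the cache steps use `actions/cache` and only tests each file as a whole, and the function name `workflows_missing_cache_guard` is hypothetical.

```shell
# List workflow files that use actions/cache but never set
# "fail-on-cache-miss: false". Returns non-zero if any are found.
# Assumption: a file-level text match is enough; this does not pair
# each cache step with its own setting.
workflows_missing_cache_guard() {
  dir=$1
  missing=0
  for f in "$dir"/*.yml; do
    [ -f "$f" ] || continue
    if grep -q 'uses: *actions/cache' "$f" &&
       ! grep -q 'fail-on-cache-miss: *false' "$f"; then
      echo "$f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage from the repository root:
workflows_missing_cache_guard .gitea/workflows || echo "some workflows still lack the guard"
```

A file with several cache steps passes as soon as one of them sets the guard, so treat a clean result as a hint rather than proof.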

2. Runner Configuration Fixes

Updated all existing runner configuration files with:

- Fixed Host: host.docker.internal (lets containers reach the host)
- Fixed Port: 44029 (instead of random port 0)
- Host Network: Job containers use host networking, so they share the host's network stack

Updated files:

- runners/config_docker.yaml
- runners/config_heavy.yaml
- runners/config_light.yaml
- runners/config_security.yaml

3. Troubleshooting Tools

Created diagnostic scripts:

- runners/fix-cache-issues.sh (Linux/macOS)
- runners/fix-cache-issues.ps1 (Windows)

These scripts help diagnose and fix cache issues.

How to Apply the Fixes

Option 1: Use the Updated Configuration

1. Stop your current runner:
```
pkill -f act_runner
```

2. Start the runner with one of the updated configurations:
```
./act_runner daemon --config config_docker.yaml
# or
./act_runner daemon --config config_heavy.yaml
# or
./act_runner daemon --config config_light.yaml
# or
./act_runner daemon --config config_security.yaml
```

Option 2: Run the Troubleshooting Script

Linux/macOS:
```
cd runners
./fix-cache-issues.sh
```

Windows:
```
cd runners
.\fix-cache-issues.ps1
```

Option 3: Manual Configuration

Update your runner configuration with these key changes:

```
cache:
  enabled: true
  host: "host.docker.internal"  # Fixed host
  port: 44029                   # Fixed port

container:
  network: "host"               # Use host networking
```
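After editing by hand, a quick text check can confirm that each config carries the pinned values from the fragment above. This is a sketch under assumptions: it expects `host:` and `port:` on their own lines, it does not check that they sit under `cache:`, and the function name `config_has_cache_fix` is hypothetical. Run it from the repository root.

```shell
# Check that a runner config pins the cache host and port.
# Assumption: "host:" and "port:" sit on their own lines, as in the
# fragment above; this is a plain text match, not a YAML parser.
config_has_cache_fix() {
  grep -Eq '^[[:space:]]*host:[[:space:]]*"?host\.docker\.internal"?' "$1" &&
  grep -Eq '^[[:space:]]*port:[[:space:]]*44029([^0-9]|$)' "$1"
}

for cfg in config_docker.yaml config_heavy.yaml config_light.yaml config_security.yaml; do
  if [ ! -f "runners/$cfg" ]; then
    echo "runners/$cfg: not found"
  elif config_has_cache_fix "runners/$cfg"; then
    echo "runners/$cfg: cache host and port are pinned"
  else
    echo "runners/$cfg: still needs the cache fix"
  fi
done
```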

Verification

After applying the fixes:

1. Check Runner Logs: Look for cache service startup messages
2. Test a Workflow: Run a simple workflow to verify that the cache works
3. Monitor Cache Hits: Check that dependencies are being cached properly
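The log checks can be reduced to two counts. The sketch below assumes that timeouts appear with the text `ETIMEDOUT` and that hits are logged with the text `Cache hit`, as this guide describes, and that the log lives in `runner.log`; the function name `summarize_cache_log` is hypothetical.

```shell
# Count cache timeouts and cache hits in a runner log.
# Assumption: hits are logged with the text "Cache hit" and timeouts
# with "ETIMEDOUT", matching the messages quoted in this guide.
summarize_cache_log() {
  log=$1
  # grep -c exits non-zero on a zero count, so guard it with "|| true".
  timeouts=$(grep -c 'ETIMEDOUT' "$log" || true)
  hits=$(grep -c 'Cache hit' "$log" || true)
  echo "timeouts=$timeouts hits=$hits"
}

if [ -f runner.log ]; then
  summarize_cache_log runner.log
fi
```

After the fix, the timeout count should stop growing and the hit count should rise on repeated runs of the same workflow.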

Expected Results

✅ No more ETIMEDOUT errors
✅ Cache hits are logged with "✅ Cache hit!" messages
✅ Faster build times due to dependency caching
✅ Workflows continue even if cache fails

Troubleshooting

If issues persist:

1. Check Docker Networking:
```
docker network ls
docker network inspect bridge
```

2. Verify Cache Service:
```
netstat -tlnp | grep 44029
```

3. Test Connectivity:
```
curl http://host.docker.internal:44029/
```

4. Check Runner Logs:
```
tail -f runner.log
```

Additional Resources

Support

If you continue to experience cache issues after applying these fixes, please:

1. Run the troubleshooting script and share the output
2. Check the runner logs for any error messages
3. Verify your Docker and network configuration
