Health Checking System
This document describes the generalized health checking system for LabFusion Service Adapters.
Overview
The health checking system is designed to be flexible and extensible, supporting different types of health checks for different services. It uses a strategy pattern with pluggable health checkers.
Architecture
Core Components
- BaseHealthChecker: Abstract base class for all health checkers
- HealthCheckResult: Standardized result object
- HealthCheckerRegistry: Registry for different checker types
- HealthCheckerFactory: Factory for creating checker instances
- ServiceStatusChecker: Main orchestrator
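The relationship between these components can be sketched roughly as follows. This is a minimal illustration of the strategy pattern, not the LabFusion implementation: the fields of `HealthCheckResult`, the `register`, `get`, and `create_checker` signatures, and the `AlwaysHealthyChecker` class are assumptions inferred from the usage examples in this document.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Type


@dataclass
class HealthCheckResult:
    """Standardized result object (field names assumed from this document)."""
    status: str
    response_time: Optional[float] = None
    error: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseHealthChecker(ABC):
    """Strategy interface: every checker implements check_health()."""

    def __init__(self, timeout: float = 5.0) -> None:
        self.timeout = timeout

    @abstractmethod
    async def check_health(
        self, service_name: str, config: Dict[str, Any]
    ) -> HealthCheckResult:
        ...


class HealthCheckerRegistry:
    """Maps a health_check_type string to a checker class."""

    def __init__(self) -> None:
        self._checkers: Dict[str, Type[BaseHealthChecker]] = {}

    def register(self, name: str, checker_cls: Type[BaseHealthChecker]) -> None:
        self._checkers[name] = checker_cls

    def get(self, name: str) -> Type[BaseHealthChecker]:
        if name not in self._checkers:
            raise KeyError(f"Unknown health checker type: {name}")
        return self._checkers[name]


class HealthCheckerFactory:
    """Creates checker instances from registered types."""

    def __init__(self, registry: HealthCheckerRegistry) -> None:
        self._registry = registry

    def create_checker(self, name: str, timeout: float = 5.0) -> BaseHealthChecker:
        return self._registry.get(name)(timeout=timeout)


class AlwaysHealthyChecker(BaseHealthChecker):
    """Hypothetical checker used only to exercise the sketch."""

    async def check_health(self, service_name, config):
        if not config.get("enabled", True):
            return HealthCheckResult(status="disabled")
        return HealthCheckResult(status="healthy", response_time=0.0)


registry = HealthCheckerRegistry()
registry.register("always_healthy", AlwaysHealthyChecker)
factory = HealthCheckerFactory(registry)

checker = factory.create_checker("always_healthy", timeout=2.0)
result = asyncio.run(checker.check_health("demo", {"enabled": True}))
print(result.status)
```

Keeping the registry separate from the factory means a new checker type only needs a `register` call; the orchestrator never has to know the concrete class.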
Health Checker Types
1. API Health Checker (APIHealthChecker)
- Purpose: Check services with HTTP health endpoints
- Use Case: Most REST APIs, microservices
- Configuration:
{
    "health_check_type": "api",
    "health_endpoint": "/api/health",
    "url": "https://service.example.com"
}
2. Sensor Health Checker (SensorHealthChecker)
- Purpose: Check services via sensor data (e.g., Home Assistant entities)
- Use Case: Home Assistant, IoT devices, sensor-based monitoring
- Configuration:
{
    "health_check_type": "sensor",
    "sensor_entity": "sensor.system_uptime",
    "url": "https://homeassistant.example.com"
}
3. Custom Health Checker (CustomHealthChecker)
- Purpose: Complex health checks with multiple validation steps
- Use Case: Services requiring multiple checks, custom logic
- Configuration:
{
    "health_check_type": "custom",
    "health_checks": [
        {
            "type": "api",
            "name": "main_api",
            "url": "https://service.example.com/api/health"
        },
        {
            "type": "sensor",
            "name": "uptime_sensor",
            "sensor_entity": "sensor.service_uptime"
        }
    ]
}
Configuration
Service Configuration Structure
SERVICES = {
    "service_name": {
        "url": "https://service.example.com",
        "enabled": True,
        "health_check_type": "api|sensor|custom",
        # API-specific
        "health_endpoint": "/api/health",
        "token": "auth_token",
        "api_key": "api_key",
        # Sensor-specific
        "sensor_entity": "sensor.entity_name",
        # Custom-specific
        "health_checks": [
            {
                "type": "api",
                "name": "check_name",
                "url": "https://endpoint.com/health"
            }
        ]
    }
}
Environment Variables
# Service URLs
HOME_ASSISTANT_URL=https://ha.example.com
FRIGATE_URL=http://frigate.local:5000
IMMICH_URL=http://immich.local:2283
N8N_URL=http://n8n.local:5678
# Authentication
HOME_ASSISTANT_TOKEN=your_token
FRIGATE_TOKEN=your_token
IMMICH_API_KEY=your_key
N8N_API_KEY=your_key
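One way these variables could be turned into per-service configuration is sketched below. The `SERVICE_ENV` table, the `load_services` helper, and the rule that a service without a URL is treated as disabled are assumptions for illustration; the document does not say how LabFusion reads its environment.

```python
import os
from typing import Any, Dict, Mapping

# (service name, URL variable, credential variable, credential key in the config)
SERVICE_ENV = [
    ("home_assistant", "HOME_ASSISTANT_URL", "HOME_ASSISTANT_TOKEN", "token"),
    ("frigate", "FRIGATE_URL", "FRIGATE_TOKEN", "token"),
    ("immich", "IMMICH_URL", "IMMICH_API_KEY", "api_key"),
    ("n8n", "N8N_URL", "N8N_API_KEY", "api_key"),
]


def load_services(env: Mapping[str, str] = os.environ) -> Dict[str, Dict[str, Any]]:
    """Build a SERVICES-style dict from environment variables."""
    services: Dict[str, Dict[str, Any]] = {}
    for name, url_var, cred_var, cred_key in SERVICE_ENV:
        url = env.get(url_var, "")
        services[name] = {
            "url": url,
            # Assumption: a service with no URL configured is disabled.
            "enabled": bool(url),
            cred_key: env.get(cred_var, ""),
        }
    return services


example_env = {"FRIGATE_URL": "http://frigate.local:5000", "FRIGATE_TOKEN": "your_token"}
services = load_services(example_env)
print(services["frigate"])
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.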
Usage Examples
Basic API Health Check
import asyncio

from services.health_checkers import factory

async def main():
    # Create API checker
    checker = factory.create_checker("api", timeout=5.0)
    # Check service
    config = {
        "url": "https://api.example.com",
        "health_endpoint": "/health",
        "enabled": True
    }
    result = await checker.check_health("example_service", config)
    print(f"Status: {result.status}")
    print(f"Response time: {result.response_time}s")

# check_health is a coroutine, so it needs a running event loop
asyncio.run(main())
Sensor-Based Health Check
import asyncio

from services.health_checkers import factory

async def main():
    # Create sensor checker
    checker = factory.create_checker("sensor", timeout=5.0)
    # Check Home Assistant sensor
    config = {
        "url": "https://ha.example.com",
        "sensor_entity": "sensor.system_uptime",
        "token": "your_token",
        "enabled": True
    }
    result = await checker.check_health("home_assistant", config)
    print(f"Uptime: {result.metadata.get('sensor_state')}")

asyncio.run(main())
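For a Home Assistant sensor, a check like this presumably reads the entity through Home Assistant's REST API, which exposes entity state at `/api/states/<entity_id>` and accepts a long-lived access token as a bearer token. The sketch below only builds the request and interprets a response payload, with no network call. The helper names `build_sensor_request` and `interpret_sensor_state`, and the rule that `unavailable` or `unknown` states count as unhealthy, are assumptions rather than documented LabFusion behavior.

```python
from typing import Any, Dict, Tuple


def build_sensor_request(config: Dict[str, Any]) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for reading one entity's state."""
    base = config["url"].rstrip("/")
    url = f"{base}/api/states/{config['sensor_entity']}"
    headers = {"Content-Type": "application/json"}
    token = config.get("token")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return url, headers


def interpret_sensor_state(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Map a state payload to a status plus the metadata keys used in this document."""
    state = payload.get("state")
    # Assumption: these states mean the sensor is not reporting.
    healthy = state not in (None, "unavailable", "unknown")
    return {
        "status": "healthy" if healthy else "unhealthy",
        "metadata": {
            "sensor_state": state,
            "last_updated": payload.get("last_updated"),
        },
    }


config = {
    "url": "https://ha.example.com",
    "sensor_entity": "sensor.system_uptime",
    "token": "your_token",
}
url, headers = build_sensor_request(config)
outcome = interpret_sensor_state(
    {"state": "12345", "last_updated": "2024-01-15T10:30:00Z"}
)
print(url, outcome["status"])
```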
Custom Health Check
import asyncio

from services.health_checkers import factory

async def main():
    # Create custom checker
    checker = factory.create_checker("custom", timeout=10.0)
    # Check with multiple validations
    config = {
        "url": "https://service.example.com",
        "enabled": True,
        "health_checks": [
            {
                "type": "api",
                "name": "main_api",
                "url": "https://service.example.com/api/health"
            },
            {
                "type": "api",
                "name": "database",
                "url": "https://service.example.com/api/db/health"
            }
        ]
    }
    result = await checker.check_health("complex_service", config)
    print(f"Overall status: {result.status}")
    print(f"Individual checks: {result.metadata.get('check_results')}")

asyncio.run(main())
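How the individual checks roll up into the overall status is not specified in this document. A plausible rule, sketched below under that assumption, is that the service is healthy only when every sub-check is healthy. The function name `combine_statuses` and the shape of its input (check name mapped to a status string) are hypothetical.

```python
from typing import Dict


def combine_statuses(check_results: Dict[str, str]) -> str:
    """Overall status: healthy only if every sub-check is healthy."""
    if not check_results:
        # Assumption: a custom check with no sub-checks is a configuration error.
        return "error"
    if all(status == "healthy" for status in check_results.values()):
        return "healthy"
    return "unhealthy"


print(combine_statuses({"main_api": "healthy", "database": "timeout"}))
```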
Health Check Results
HealthCheckResult Structure
{
    "status": "healthy|unhealthy|disabled|error|timeout|unauthorized|forbidden",
    "response_time": 0.123,  # seconds
    "error": "Error message if applicable",
    "metadata": {
        "http_status": 200,
        "response_size": 1024,
        "sensor_state": "12345",
        "last_updated": "2024-01-15T10:30:00Z"
    }
}
Status Values
- healthy: Service is responding normally
- unhealthy: Service responded, but with an error status
- disabled: Service is disabled in configuration
- timeout: Request timed out
- unauthorized: Authentication required (HTTP 401)
- forbidden: Access forbidden (HTTP 403)
- error: Network or other error occurred
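For HTTP-based checks, the mapping from a response code to these values might look like the sketch below. Only the 401 and 403 cases are stated in this document. Treating any 2xx response as `healthy` and every other response as `unhealthy` is an assumption, and `status_from_http` is a hypothetical helper name.

```python
def status_from_http(code: int) -> str:
    """Translate an HTTP status code into a health status value."""
    if code == 401:
        return "unauthorized"
    if code == 403:
        return "forbidden"
    if 200 <= code < 300:
        # Assumption: any 2xx response counts as healthy.
        return "healthy"
    return "unhealthy"


for code in (200, 401, 403, 503):
    print(code, status_from_http(code))
```

The `timeout`, `error`, and `disabled` values never reach this function, since they describe cases where no HTTP response was received at all.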
Extending the System
Adding a New Health Checker
1. Create the checker class:
    from typing import Dict

    from .base import BaseHealthChecker, HealthCheckResult

    class MyCustomChecker(BaseHealthChecker):
        async def check_health(self, service_name: str, config: Dict) -> HealthCheckResult:
            # Implementation
            pass
2. Register the checker:
    from services.health_checkers import registry

    registry.register("my_custom", MyCustomChecker)
3. Use in configuration:
    {
        "health_check_type": "my_custom",
        "custom_param": "value"
    }
Service-Specific Logic
The factory automatically selects the appropriate checker based on:
- health_check_type in configuration
- Service name patterns
- Configuration presence (e.g., sensor_entity → sensor checker)
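The selection criteria above could be applied in the order listed, roughly as sketched here. The precedence, the `NAME_PATTERNS` table, the fallback to `api`, and the function name `select_checker_type` are all assumptions; the document names the three criteria but not how conflicts between them are resolved.

```python
from typing import Any, Dict

# Hypothetical name patterns; the real patterns are not documented here.
NAME_PATTERNS = {"home_assistant": "sensor"}


def select_checker_type(service_name: str, config: Dict[str, Any]) -> str:
    """Pick a checker type from explicit config, name pattern, or config keys."""
    explicit = config.get("health_check_type")
    if explicit:
        return explicit
    lowered = service_name.lower()
    for pattern, checker_type in NAME_PATTERNS.items():
        if pattern in lowered:
            return checker_type
    if "health_checks" in config:
        return "custom"
    if "sensor_entity" in config:
        return "sensor"
    # Assumption: plain HTTP checking is the default.
    return "api"


print(select_checker_type("frigate", {"sensor_entity": "sensor.frigate_uptime"}))
```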
Performance Considerations
- Concurrent Checking: All services are checked simultaneously
- Checker Caching: Checkers are cached per service to avoid recreation
- Timeout Management: Configurable timeouts per checker type
- Resource Cleanup: Proper cleanup of HTTP clients
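Concurrent checking with a per-check timeout can be sketched with `asyncio.gather` and `asyncio.wait_for`. This is an illustration under assumptions, not the `ServiceStatusChecker` code: `check_all` and `fake_check` are hypothetical, and the checks here return plain status strings instead of `HealthCheckResult` objects.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

CheckFn = Callable[[str, Dict[str, Any]], Awaitable[str]]


async def check_all(
    services: Dict[str, Dict[str, Any]], check_one: CheckFn, timeout: float = 5.0
) -> Dict[str, str]:
    """Run every check concurrently; one slow or failing check cannot block the rest."""

    async def guarded(name: str, config: Dict[str, Any]) -> str:
        try:
            return await asyncio.wait_for(check_one(name, config), timeout)
        except asyncio.TimeoutError:
            return "timeout"
        except Exception:
            return "error"

    names = list(services)
    results = await asyncio.gather(*(guarded(n, services[n]) for n in names))
    return dict(zip(names, results))


async def fake_check(name: str, config: Dict[str, Any]) -> str:
    """Stand-in for a real checker: sleeps, then reports healthy."""
    await asyncio.sleep(config["delay"])
    return "healthy"


services = {"fast": {"delay": 0.0}, "slow": {"delay": 1.0}}
statuses = asyncio.run(check_all(services, fake_check, timeout=0.05))
print(statuses)
```

Because each check is wrapped individually, the total wall-clock time is bounded by the timeout rather than by the sum of all response times.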
Monitoring and Logging
- Debug Logs: Detailed operation logs for troubleshooting
- Performance Metrics: Response times and success rates
- Error Tracking: Comprehensive error logging with context
- Health Summary: Overall system health statistics
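A health summary of the kind mentioned above could be derived from the per-service statuses as in this sketch. The output keys and the choice to exclude disabled services from the healthy ratio are assumptions, and `summarize` is a hypothetical helper name.

```python
from collections import Counter
from typing import Any, Dict


def summarize(statuses: Dict[str, str]) -> Dict[str, Any]:
    """Count services per status and compute the share of active services that are healthy."""
    counts = Counter(statuses.values())
    # Assumption: disabled services do not count against overall health.
    active = len(statuses) - counts["disabled"]
    return {
        "total": len(statuses),
        "by_status": dict(counts),
        "healthy_ratio": counts["healthy"] / active if active else None,
    }


summary = summarize(
    {"home_assistant": "healthy", "frigate": "timeout", "n8n": "disabled"}
)
print(summary)
```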
Best Practices
- Choose Appropriate Checker: Use the right checker type for your service
- Set Reasonable Timeouts: Balance responsiveness with reliability
- Handle Errors Gracefully: Always provide meaningful error messages
- Monitor Performance: Track response times and success rates
- Test Thoroughly: Verify health checks work in all scenarios
- Document Configuration: Keep service configurations well-documented
Troubleshooting
Common Issues
- Timeout Errors: Increase timeout or check network connectivity
- Authentication Failures: Verify tokens and API keys
- Sensor Not Found: Check entity names and permissions
- Configuration Errors: Validate service configuration structure
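Configuration errors are often easiest to catch before any check runs. The sketch below validates one service entry against the structure described in this document. The specific rules (which keys are required for which checker type) are inferred from the examples and may be stricter or looser than what LabFusion enforces, and `validate_service_config` is a hypothetical helper.

```python
from typing import Any, Dict, List


def validate_service_config(name: str, config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    if not config.get("enabled", True):
        return problems  # disabled services are not checked
    if not config.get("url"):
        problems.append(f"{name}: missing 'url'")
    check_type = config.get("health_check_type")
    if check_type == "sensor" and not config.get("sensor_entity"):
        problems.append(f"{name}: sensor check needs 'sensor_entity'")
    if check_type == "custom":
        checks = config.get("health_checks") or []
        if not checks:
            problems.append(f"{name}: custom check needs a non-empty 'health_checks' list")
        for index, check in enumerate(checks):
            for key in ("type", "name"):
                if key not in check:
                    problems.append(f"{name}: health_checks[{index}] is missing '{key}'")
    return problems


print(validate_service_config(
    "home_assistant",
    {"url": "https://ha.example.com", "health_check_type": "sensor"},
))
```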
Debug Tools
- Debug Endpoint: /debug/logging to test logging configuration
- Health Check Logs: Detailed logs for each health check operation
- Metadata Inspection: Check metadata for additional context