feat: Enhance frontend loading experience and service status handling
Some checks failed
Integration Tests / integration-tests (push) Failing after 20s
Integration Tests / performance-tests (push) Has been skipped
Service Adapters (Python FastAPI) / test (3.11) (push) Failing after 23s
Frontend (React) / test (20) (push) Failing after 1m3s
Frontend (React) / build (push) Has been skipped
Frontend (React) / lighthouse (push) Has been skipped
Service Adapters (Python FastAPI) / test (3.12) (push) Failing after 23s
Service Adapters (Python FastAPI) / test (3.13) (push) Failing after 20s
Service Adapters (Python FastAPI) / build (push) Has been skipped
### Summary of Changes
- Removed proxy configuration in `rsbuild.config.js` as the API Gateway is not running.
- Added smooth transitions and gentle loading overlays in CSS for improved user experience during data loading.
- Updated `Dashboard` component to conditionally display loading spinner and gentle loading overlay based on data fetching state.
- Enhanced `useOfflineAwareServiceStatus` and `useOfflineAwareSystemData` hooks to manage loading states and service status more effectively.
- Increased refresh intervals for service status and system data to reduce API call frequency.

### Expected Results
- Improved user experience with smoother loading transitions and better feedback during data refreshes.
- Enhanced handling of service status checks, providing clearer information when services are unavailable.
- Streamlined code for managing loading states, making it easier to maintain and extend in the future.
280
services/service-adapters/HEALTH_CHECKING.md
Normal file
@@ -0,0 +1,280 @@
# Health Checking System

This document describes the generalized health checking system for LabFusion Service Adapters.

## Overview

The health checking system is designed to be flexible and extensible, supporting different types of health checks for different services. It uses a strategy pattern with pluggable health checkers.

## Architecture

### Core Components

1. **BaseHealthChecker**: Abstract base class for all health checkers
2. **HealthCheckResult**: Standardized result object
3. **HealthCheckerRegistry**: Registry for different checker types
4. **HealthCheckerFactory**: Factory for creating checker instances
5. **ServiceStatusChecker**: Main orchestrator
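
To make the relationship between these components concrete, here is a minimal sketch of how a strategy-style base class, a registry, and a factory could fit together. It is not the actual implementation: the method names on the registry and factory, the constructor signatures, and `AlwaysHealthyChecker` are assumptions chosen to match the usage shown elsewhere in this document.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Type


@dataclass
class HealthCheckResult:
    """Standardized result; field names mirror the documented result structure."""
    status: str
    response_time: Optional[float] = None
    error: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseHealthChecker(ABC):
    """Strategy interface: every checker exposes the same coroutine."""

    def __init__(self, timeout: float = 5.0) -> None:
        self.timeout = timeout

    @abstractmethod
    async def check_health(self, service_name: str, config: Dict[str, Any]) -> HealthCheckResult:
        ...


class HealthCheckerRegistry:
    """Maps a checker type name to the class that implements it."""

    def __init__(self) -> None:
        self._checkers: Dict[str, Type[BaseHealthChecker]] = {}

    def register(self, name: str, checker_cls: Type[BaseHealthChecker]) -> None:
        self._checkers[name] = checker_cls

    def get(self, name: str) -> Type[BaseHealthChecker]:
        if name not in self._checkers:
            raise KeyError(f"No health checker registered for type {name!r}")
        return self._checkers[name]


class HealthCheckerFactory:
    """Creates checker instances from the types known to a registry."""

    def __init__(self, registry: HealthCheckerRegistry) -> None:
        self._registry = registry

    def create_checker(self, checker_type: str, timeout: float = 5.0) -> BaseHealthChecker:
        return self._registry.get(checker_type)(timeout=timeout)


class AlwaysHealthyChecker(BaseHealthChecker):
    """Hypothetical stand-in checker, used only to exercise the sketch."""

    async def check_health(self, service_name: str, config: Dict[str, Any]) -> HealthCheckResult:
        return HealthCheckResult(status="healthy", response_time=0.0)
```

Because callers only depend on `check_health`, a new checker type can be added by registering another subclass without touching the orchestrator.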

### Health Checker Types

#### 1. API Health Checker (`APIHealthChecker`)
- **Purpose**: Check services with HTTP health endpoints
- **Use Case**: Most REST APIs, microservices
- **Configuration**:
  ```python
  {
      "health_check_type": "api",
      "health_endpoint": "/api/health",
      "url": "https://service.example.com"
  }
  ```
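As an illustration of how `url` and `health_endpoint` might be combined into the address that gets requested, the sketch below joins them with exactly one slash. The function name `build_health_url` and the `/health` fallback are assumptions, not part of the documented API.

```python
from typing import Any, Dict


def build_health_url(config: Dict[str, Any], default_endpoint: str = "/health") -> str:
    """Join the service base URL and its health endpoint with exactly one slash."""
    base = config["url"].rstrip("/")
    # Assumption: fall back to a generic endpoint when none is configured.
    endpoint = config.get("health_endpoint", default_endpoint)
    return f"{base}/{endpoint.lstrip('/')}"
```

Normalizing the slashes means a trailing `/` on `url` or a missing leading `/` on `health_endpoint` does not produce a malformed address.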

#### 2. Sensor Health Checker (`SensorHealthChecker`)
- **Purpose**: Check services via sensor data (e.g., Home Assistant entities)
- **Use Case**: Home Assistant, IoT devices, sensor-based monitoring
- **Configuration**:
  ```python
  {
      "health_check_type": "sensor",
      "sensor_entity": "sensor.system_uptime",
      "url": "https://homeassistant.example.com"
  }
  ```
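For Home Assistant, an entity's state can be read from the REST API at `/api/states/<entity_id>` using a bearer token. The sketch below shows how a sensor check could build that request and interpret the returned state. The helper names and the rule that any usable state counts as healthy are assumptions, not the checker's confirmed behavior.

```python
from typing import Any, Dict, Tuple

# Home Assistant reports these states when an entity has no usable value.
UNUSABLE_STATES = {"unavailable", "unknown"}


def build_sensor_request(config: Dict[str, Any]) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for reading one entity's state."""
    url = f"{config['url'].rstrip('/')}/api/states/{config['sensor_entity']}"
    headers = {"Content-Type": "application/json"}
    if config.get("token"):
        headers["Authorization"] = f"Bearer {config['token']}"
    return url, headers


def status_from_sensor_state(state: str) -> str:
    """Assumed rule: any usable state means the service is healthy."""
    return "unhealthy" if state in UNUSABLE_STATES else "healthy"
```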

#### 3. Custom Health Checker (`CustomHealthChecker`)
- **Purpose**: Complex health checks with multiple validation steps
- **Use Case**: Services requiring multiple checks, custom logic
- **Configuration**:
  ```python
  {
      "health_check_type": "custom",
      "health_checks": [
          {
              "type": "api",
              "name": "main_api",
              "url": "https://service.example.com/api/health"
          },
          {
              "type": "sensor",
              "name": "uptime_sensor",
              "sensor_entity": "sensor.service_uptime"
          }
      ]
  }
  ```
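A custom check has to fold several sub-check outcomes into one overall status. This document does not state the aggregation rule, so the sketch below assumes the strictest plausible policy: the service is healthy only if every named sub-check is healthy. The function name `aggregate_status` is hypothetical.

```python
from typing import Dict


def aggregate_status(check_results: Dict[str, str]) -> str:
    """Assumed policy: healthy only if every named sub-check is healthy."""
    if not check_results:
        # Nothing was checked, so the health of the service is unknown.
        return "error"
    if all(status == "healthy" for status in check_results.values()):
        return "healthy"
    return "unhealthy"
```

A looser policy (for example, tolerating one failed sub-check) would only change the condition inside `all(...)`.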

## Configuration

### Service Configuration Structure

```python
SERVICES = {
    "service_name": {
        "url": "https://service.example.com",
        "enabled": True,
        "health_check_type": "api|sensor|custom",

        # API-specific
        "health_endpoint": "/api/health",
        "token": "auth_token",
        "api_key": "api_key",

        # Sensor-specific
        "sensor_entity": "sensor.entity_name",

        # Custom-specific
        "health_checks": [
            {
                "type": "api",
                "name": "check_name",
                "url": "https://endpoint.com/health"
            }
        ]
    }
}
```
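Since each checker type expects different keys, a small validation pass can catch configuration mistakes before any request is made. The sketch below is one way to do this. The required-key table, the default type of `"api"`, and the function name are assumptions derived from the structure above, not an existing validator.

```python
from typing import Any, Dict, List

# Assumed mapping of checker type to the type-specific key it needs.
REQUIRED_BY_TYPE = {
    "api": ["health_endpoint"],
    "sensor": ["sensor_entity"],
    "custom": ["health_checks"],
}


def validate_service_config(name: str, config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: List[str] = []
    if not config.get("url"):
        problems.append(f"{name}: missing 'url'")
    # Assumption: a service without an explicit type is treated as an API check.
    check_type = config.get("health_check_type", "api")
    if check_type not in REQUIRED_BY_TYPE:
        problems.append(f"{name}: unknown health_check_type {check_type!r}")
        return problems
    for key in REQUIRED_BY_TYPE[check_type]:
        if key not in config:
            problems.append(f"{name}: '{key}' is required for type {check_type!r}")
    return problems
```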

### Environment Variables

```bash
# Service URLs
HOME_ASSISTANT_URL=https://ha.example.com
FRIGATE_URL=http://frigate.local:5000
IMMICH_URL=http://immich.local:2283
N8N_URL=http://n8n.local:5678

# Authentication
HOME_ASSISTANT_TOKEN=your_token
FRIGATE_TOKEN=your_token
IMMICH_API_KEY=your_key
N8N_API_KEY=your_key
```
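These variables presumably feed the `SERVICES` dictionary. A minimal sketch of that mapping follows. The service keys (`home_assistant`, `frigate`, `immich`, `n8n`), the helper name `load_services`, and the choice to enable a service only when its URL is set are assumptions for illustration.

```python
import os
from typing import Any, Dict, Mapping

# service name -> (URL variable, credential key in config, credential variable)
_ENV_LAYOUT = {
    "home_assistant": ("HOME_ASSISTANT_URL", "token", "HOME_ASSISTANT_TOKEN"),
    "frigate": ("FRIGATE_URL", "token", "FRIGATE_TOKEN"),
    "immich": ("IMMICH_URL", "api_key", "IMMICH_API_KEY"),
    "n8n": ("N8N_URL", "api_key", "N8N_API_KEY"),
}


def load_services(env: Mapping[str, str] = os.environ) -> Dict[str, Dict[str, Any]]:
    """Build a SERVICES-style dictionary from environment variables."""
    services: Dict[str, Dict[str, Any]] = {}
    for name, (url_var, cred_key, cred_var) in _ENV_LAYOUT.items():
        url = env.get(url_var, "")
        services[name] = {
            "url": url,
            cred_key: env.get(cred_var, ""),
            # Assumption: a service with no URL configured is disabled.
            "enabled": bool(url),
        }
    return services
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.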

## Usage Examples

### Basic API Health Check

```python
from services.health_checkers import factory

# Create API checker
checker = factory.create_checker("api", timeout=5.0)

# Check service
config = {
    "url": "https://api.example.com",
    "health_endpoint": "/health",
    "enabled": True
}
result = await checker.check_health("example_service", config)
print(f"Status: {result.status}")
print(f"Response time: {result.response_time}s")
```
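The snippets in this section use `await` directly, which only works inside a coroutine or an async-aware REPL. In a plain script the call can be wrapped in an `async def` and driven with `asyncio.run`, as sketched below. `StubChecker` is a hypothetical stand-in for whatever `factory.create_checker` returns, so that the example runs without the real package.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Dict


class StubChecker:
    """Hypothetical stand-in for the object returned by factory.create_checker."""

    async def check_health(self, service_name: str, config: Dict[str, Any]) -> SimpleNamespace:
        await asyncio.sleep(0)  # yield to the event loop once
        return SimpleNamespace(status="healthy", response_time=0.0)


async def main() -> SimpleNamespace:
    checker = StubChecker()
    config = {
        "url": "https://api.example.com",
        "health_endpoint": "/health",
        "enabled": True,
    }
    return await checker.check_health("example_service", config)


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(f"Status: {result.status}")
```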

### Sensor-Based Health Check

```python
# Create sensor checker
checker = factory.create_checker("sensor", timeout=5.0)

# Check Home Assistant sensor
config = {
    "url": "https://ha.example.com",
    "sensor_entity": "sensor.system_uptime",
    "token": "your_token",
    "enabled": True
}
result = await checker.check_health("home_assistant", config)
print(f"Uptime: {result.metadata.get('sensor_state')}")
```

### Custom Health Check

```python
# Create custom checker
checker = factory.create_checker("custom", timeout=10.0)

# Check with multiple validations
config = {
    "url": "https://service.example.com",
    "enabled": True,
    "health_checks": [
        {
            "type": "api",
            "name": "main_api",
            "url": "https://service.example.com/api/health"
        },
        {
            "type": "api",
            "name": "database",
            "url": "https://service.example.com/api/db/health"
        }
    ]
}
result = await checker.check_health("complex_service", config)
print(f"Overall status: {result.status}")
print(f"Individual checks: {result.metadata.get('check_results')}")
```

## Health Check Results

### HealthCheckResult Structure

```python
{
    "status": "healthy|unhealthy|disabled|error|timeout|unauthorized|forbidden",
    "response_time": 0.123,  # seconds
    "error": "Error message if applicable",
    "metadata": {
        "http_status": 200,
        "response_size": 1024,
        "sensor_state": "12345",
        "last_updated": "2024-01-15T10:30:00Z"
    }
}
```

### Status Values

- **healthy**: Service is responding normally
- **unhealthy**: Service responded but with error status
- **disabled**: Service is disabled in configuration
- **timeout**: Request timed out
- **unauthorized**: Authentication required (HTTP 401)
- **forbidden**: Access forbidden (HTTP 403)
- **error**: Network or other error occurred
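
For HTTP-based checks, the list above suggests a direct mapping from response code to status value. The sketch below shows one such mapping. Treating every 2xx code as healthy and every other non-401/403 code as unhealthy is an assumption, and the function name `status_from_http` is hypothetical.

```python
def status_from_http(status_code: int) -> str:
    """Translate an HTTP response code into one of the documented status values."""
    if status_code == 401:
        return "unauthorized"
    if status_code == 403:
        return "forbidden"
    if 200 <= status_code < 300:
        # Assumption: any 2xx response counts as a normal reply.
        return "healthy"
    return "unhealthy"
```

The `timeout` and `error` values never come from a response code; they describe requests that produced no response at all.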

## Extending the System

### Adding a New Health Checker

1. **Create the checker class**:
   ```python
   from typing import Dict

   from .base import BaseHealthChecker, HealthCheckResult

   class MyCustomChecker(BaseHealthChecker):
       async def check_health(self, service_name: str, config: Dict) -> HealthCheckResult:
           # Implementation
           pass
   ```

2. **Register the checker**:
   ```python
   from services.health_checkers import registry

   registry.register("my_custom", MyCustomChecker)
   ```

3. **Use in configuration**:
   ```python
   {
       "health_check_type": "my_custom",
       "custom_param": "value"
   }
   ```

### Service-Specific Logic

The factory automatically selects the appropriate checker based on:
1. `health_check_type` in configuration
2. Service name patterns
3. Configuration presence (e.g., `sensor_entity` → sensor checker)
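
The three selection rules can be read as a fallback chain. The sketch below applies them in the listed order, which is an assumption about precedence. The `home_assistant` name rule, the `health_checks` rule, the final `"api"` default, and the function name are hypothetical examples rather than the factory's actual rules.

```python
from typing import Any, Dict


def select_checker_type(service_name: str, config: Dict[str, Any]) -> str:
    """Pick a checker type using explicit config, then name, then key presence."""
    # 1. An explicit health_check_type always wins.
    explicit = config.get("health_check_type")
    if explicit:
        return explicit
    # 2. Service name pattern (hypothetical example pattern).
    if service_name == "home_assistant":
        return "sensor"
    # 3. Presence of type-specific keys.
    if "sensor_entity" in config:
        return "sensor"
    if "health_checks" in config:
        return "custom"
    # Assumed default when nothing else matches.
    return "api"
```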

## Performance Considerations

- **Concurrent Checking**: All services are checked simultaneously
- **Checker Caching**: Checkers are cached per service to avoid recreation
- **Timeout Management**: Configurable timeouts per checker type
- **Resource Cleanup**: Proper cleanup of HTTP clients
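
Concurrent checking with a per-check timeout can be expressed with `asyncio.gather` and `asyncio.wait_for`. The sketch below illustrates the idea with plain coroutines that return a status string. The function names and the mapping of exceptions to `timeout` and `error` are assumptions, not the orchestrator's actual code.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Check = Callable[[], Awaitable[str]]


async def check_with_timeout(check: Check, timeout: float) -> str:
    """Run one check, converting a slow or failing check into a status value."""
    try:
        return await asyncio.wait_for(check(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"
    except Exception:
        return "error"


async def check_all(checks: Dict[str, Check], timeout: float = 5.0) -> Dict[str, str]:
    """Start every check at once and collect the statuses by service name."""
    names = list(checks)
    statuses = await asyncio.gather(
        *(check_with_timeout(checks[name], timeout) for name in names)
    )
    return dict(zip(names, statuses))
```

Because each check handles its own timeout, one slow service delays the overall result by at most `timeout` seconds rather than blocking the others.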

## Monitoring and Logging

- **Debug Logs**: Detailed operation logs for troubleshooting
- **Performance Metrics**: Response times and success rates
- **Error Tracking**: Comprehensive error logging with context
- **Health Summary**: Overall system health statistics
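
As an example of what a health summary might contain, the sketch below counts services per status and computes the share of enabled services that are healthy. The field names in the returned dictionary and the decision to exclude disabled services from the ratio are assumptions.

```python
from collections import Counter
from typing import Any, Dict


def summarize(statuses: Dict[str, str]) -> Dict[str, Any]:
    """Summarize a mapping of service name to status value."""
    counts = Counter(statuses.values())
    # Assumption: disabled services do not count for or against system health.
    enabled = sum(n for status, n in counts.items() if status != "disabled")
    healthy = counts.get("healthy", 0)
    return {
        "total": len(statuses),
        "by_status": dict(counts),
        "healthy_ratio": healthy / enabled if enabled else 0.0,
    }
```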

## Best Practices

1. **Choose Appropriate Checker**: Use the right checker type for your service
2. **Set Reasonable Timeouts**: Balance responsiveness with reliability
3. **Handle Errors Gracefully**: Always provide meaningful error messages
4. **Monitor Performance**: Track response times and success rates
5. **Test Thoroughly**: Verify health checks work in all scenarios
6. **Document Configuration**: Keep service configurations well-documented

## Troubleshooting

### Common Issues

1. **Timeout Errors**: Increase timeout or check network connectivity
2. **Authentication Failures**: Verify tokens and API keys
3. **Sensor Not Found**: Check entity names and permissions
4. **Configuration Errors**: Validate service configuration structure

### Debug Tools

- **Debug Endpoint**: `/debug/logging` to test logging configuration
- **Health Check Logs**: Detailed logs for each health check operation
- **Metadata Inspection**: Check metadata for additional context