diff --git a/docs/adr/ADR-001-logging-architecture.md b/docs/adr/ADR-001-logging-architecture.md new file mode 100644 index 00000000..242e5191 --- /dev/null +++ b/docs/adr/ADR-001-logging-architecture.md @@ -0,0 +1,384 @@ +# ADR-001: Logging Architecture for PopUp-Sim + +## Quick Reference + +**Decision:** Option 3 (Hybrid) | **Fallback:** Option 1 (if Option 3 rejected) | **Status:** Proposed + +## Contents + +1. [Context](#1-context) +2. [Architecture Options](#2-architecture-options) +3. [Comparison](#3-comparison) +4. [Decision](#4-decision) +5. [Consequences](#5-consequences) +6. [Alternatives Considered](#6-alternatives-considered) +7. [Implementation](#7-implementation) +8. [Migration Path](#8-migration-path) +9. [When to Revisit](#9-when-to-revisit) + +## 1. Context + +PopUp-Sim requires dual-output logging: +1. **Console feedback**: User-facing progress tracking (controlled by `--verbose` flag) +2. **Structured data export**: CSV/JSON for dashboard visualization (independent of console) + +**Key Constraint**: Both outputs must be independently configurable. + +## 2. Architecture Options + +### Option 1: Single Logger + Multiple Handlers + +``` +┌─────────────────────────────────────┐ +│ Simulation Code │ +│ logger.info(event) │ +│ ▼ │ +│ ┌────────────────┐ │ +│ │ SINGLE LOGGER │ │ +│ └────────┬───────┘ │ +│ │ │ +│ ┌────────┼────────┬──────┐ │ +│ ▼ ▼ ▼ ▼ │ +│ Console File CSV JSON │ +│ Handler Handler Handler Handler │ +└─────────────────────────────────────┘ +``` + +**Principle**: ONE logger distributes to MULTIPLE handlers. + +### Option 2: Multiple Independent Loggers + +``` +┌─────────────────────────────────────┐ +│ Simulation Code │ +│ emit_event(event) │ +│ │ │ +│ ┌────────┼────────┬──────┐ │ +│ ▼ ▼ ▼ ▼ │ +│ Logger Logger Logger Logger │ +│ console file csv json │ +│ (propagate=False for all) │ +└─────────────────────────────────────┘ +``` + +**Principle**: FOUR independent loggers, each with own config. 
+ +### Option 3: Hybrid (Logging + Event Collection) + +``` +┌─────────────────────────────────────┐ +│ Simulation Code │ +│ _emit_event(event) │ +│ │ │ +│ ┌─────┴─────┐ │ +│ ▼ ▼ │ +│ Console Event │ +│ Logger Collector │ +│ (logging) (custom) │ +│ │ │ │ +│ ▼ ▼ │ +│ stdout CSV/JSON │ +└─────────────────────────────────────┘ +``` + +**Principle**: Separate systems for different concerns. + +## 3. Comparison + +### Feature Matrix + +| Feature | Option 1 | Option 2 | Option 3 | +|---------|----------|----------|----------| +| **Independence** | ❌ Shared logger | ✅ Full | ✅ Complete | +| **Complexity** | ✅ Simple | ❌ 4 loggers | ⚠️ 2 systems | +| **Standard Python** | ✅ Yes | ✅ Yes | ⚠️ Hybrid | +| **Conceptual Fit** | ❌ Forced | ❌ Forced | ✅ Natural | +| **Extensibility** | ❌ Limited | ⚠️ Medium | ✅ Easy | +| **Verbose Handling** | ⚠️ Global | ⚠️ Per-logger | ✅ Clean | +| **Testing** | ✅ Simple | ❌ Complex | ✅ Simple | +| **Performance** | ⚠️ All handlers | ⚠️ All loggers | ✅ Optimal | + +### Pros & Cons + +| Aspect | Option 1 | Option 2 | Option 3 | +|--------|----------|----------|----------| +| **Pros** | • Standard pattern<br>• Simple setup<br>• Built-in features<br>• Familiar | • Full independence<br>• Clear separation<br>• Flexible config | • Right tool for job<br>• Complete independence<br>• Easy to extend<br>• Clear intent | +| **Cons** | • Tight coupling<br>• Forced paradigm<br>• Limited flexibility<br>• Verbose affects all | • Config complexity<br>• Must set propagate=False<br>• Still forced paradigm<br>• 4 loggers to manage | • Two systems<br>• Events processed twice<br>• Slightly more code | + +### Verbose Flag Impact + +| Option | Console | Data Export | Complexity | +|--------|---------|-------------|------------| +| **Option 1** | ✅ Controlled | ⚠️ Affected by logger level | Medium | +| **Option 2** | ✅ Per-logger | ⚠️ Must configure each | High | +| **Option 3** | ✅ Controlled | ✅ Unaffected | Low | + +## 4. Decision + +**Status:** Proposed +**Recommended:** Option 3 (Hybrid Approach) +**Fallback:** Option 1 (Single Logger + Multiple Handlers) + +### Rationale + +1. **Separation of Concerns**: Logging ≠ Data Export + - Console output is for humans → use logging + - CSV/JSON export is for machines → use custom collector + +2. **Independence**: Systems don't interfere + - Verbose flag affects only console + - Data export always consistent + - Each system evolves independently + +3. **Extensibility**: Easy to add new exporters + - New exporter = new class + - No changes to logging system + +4. **Clarity**: Code expresses intent + ```python + logger.info("message") # Clear: logging + collector.collect(event) # Clear: data collection + ``` + +### Fallback Recommendation + +**If Option 3 is rejected: Choose Option 1** (simpler than Option 2) + +
+Why Option 1 over Option 2? + +**Advantages:** +- ✅ Simpler setup (1 logger vs 4 loggers) +- ✅ Standard Python pattern (familiar to all developers) +- ✅ Easier testing (mock 1 logger vs 4) +- ✅ Less code to maintain +- ✅ No propagation trap (`propagate=False` easy to forget) +- ✅ Coupling issue has workaround: set logger to DEBUG, control via handler levels + +**Option 1 Workaround:** +```python +logger.setLevel(logging.DEBUG) # Allow all events through +console_handler.setLevel(logging.INFO) # User-facing +csv_handler.setLevel(logging.DEBUG) # Complete data export +# Result: Independent handler control without tight coupling +``` + +**Conclusion:** Option 2's complexity burden (4 loggers, propagation settings, 4x configuration) outweighs Option 1's coupling issue (which has a simple workaround). + +
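The workaround can be checked with a small self-contained sketch. It is an illustration only: `ListHandler`, `build_logger`, and the `PopupSimLevelDemo.*` logger names are hypothetical and not part of the proposed code base, and the in-memory list handler merely stands in for a CSV or JSON handler.

```python
import io
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for a CSV/JSON handler: keeps messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def build_logger(
    name: str, logger_level: int
) -> tuple[logging.Logger, io.StringIO, ListHandler]:
    """Attach an INFO console handler and a DEBUG export handler to one logger."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False  # keep the demo isolated from the root logger
    logger.setLevel(logger_level)

    console_buffer = io.StringIO()  # captures what would go to stdout
    console_handler = logging.StreamHandler(console_buffer)
    console_handler.setLevel(logging.INFO)
    logger.addHandler(console_handler)

    export_handler = ListHandler()
    export_handler.setLevel(logging.DEBUG)
    logger.addHandler(export_handler)
    return logger, console_buffer, export_handler


# Workaround: logger at DEBUG, each handler filters on its own level.
logger, console, export = build_logger("PopupSimLevelDemo.open", logging.DEBUG)
logger.debug("wagon detail")
logger.info("train arrived")
console_lines = console.getvalue().splitlines()
export_messages = export.messages

# Coupling problem: a WARNING logger drops INFO before any handler sees it.
strict_logger, strict_console, strict_export = build_logger(
    "PopupSimLevelDemo.strict", logging.WARNING
)
strict_logger.info("train arrived")
```

With the logger at DEBUG, the console buffer holds only the INFO message while the export handler holds both; with the logger at WARNING, neither handler receives anything, regardless of its own level.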
+ +## 5. Consequences + +**Note:** These consequences apply to **Option 3 (Hybrid Approach)** - the recommended solution. + +### Positive +- ✅ Clean architecture with clear responsibilities +- ✅ Verbose flag cleanly handled (console only) +- ✅ Data export deterministic and consistent +- ✅ Easy to test each system independently +- ✅ Simple to extend with new exporters +- ✅ Type-safe with mypy compliance + +### Negative +- ⚠️ Two systems to maintain (but each simpler) +- ⚠️ Events processed twice (but operations fast) +- ⚠️ Slightly more code (but better organized) + +### Neutral +- Each system can evolve independently +- Team needs to understand both systems +- More files but clearer organization + +## 6. Alternatives Considered + +### Option 1: Single Logger + Multiple Handlers (Fallback Choice) + +
+Why Not Primary Choice? + +**Reasons:** +- **Tight Coupling**: All handlers share same logger level. If logger is set to WARNING, INFO messages never reach ANY handler, even if handler level is DEBUG. +- **Forced Paradigm**: CSV/JSON export conceptually not "logging" but forced into logging framework via custom handlers. +- **Limited Flexibility**: Hard to add non-logging outputs (e.g., database, message queue) without implementing Handler interface. +- **Verbose Flag Issue**: Affects entire logging system; cannot have verbose console + compact file independently without complex formatter management. + +**Example Problem:** +```python +logger.setLevel(logging.WARNING) # Blocks all INFO messages +console_handler.setLevel(logging.INFO) # Never receives INFO +csv_handler.setLevel(logging.DEBUG) # Never receives DEBUG +# Result: Data export incomplete +``` + +
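To make the "forced paradigm" point concrete, here is a minimal sketch of how structured data has to travel through a `LogRecord` under Option 1. `DemoEvent`, `EventDataHandler`, and the logger name `PopupSimExtraDemo` are hypothetical names chosen for this illustration; only the `extra=` mechanism itself is standard `logging` behaviour.

```python
import logging
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class DemoEvent:
    """Hypothetical, trimmed-down stand-in for SimulationEvent."""

    timestamp: float
    event_type: str
    entity_id: str


class EventDataHandler(logging.Handler):
    """Hypothetical export handler: keeps only records that carry event_data."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[dict[str, Any]] = []

    def emit(self, record: logging.LogRecord) -> None:
        # Keys passed via `extra` become attributes on the LogRecord.
        event_data = getattr(record, "event_data", None)
        if event_data is not None:
            self.rows.append(event_data)


logger = logging.getLogger("PopupSimExtraDemo")
logger.handlers.clear()
logger.propagate = False  # keep the demo isolated from the root logger
logger.setLevel(logging.DEBUG)
export_handler = EventDataHandler()
logger.addHandler(export_handler)

event = DemoEvent(timestamp=10.5, event_type="train_arrival", entity_id="TRAIN-001")
logger.info("train_arrival: TRAIN-001", extra={"event_data": asdict(event)})
logger.info("Starting simulation...")  # no event_data, so the export handler skips it

rows = export_handler.rows
```

The export path works, but every event is wrapped in a log call and unwrapped again inside the handler, which is the awkwardness the bullets above describe.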
+ +### Option 2: Multiple Independent Loggers (Rejected) + +
+Why Rejected? + +**Reasons:** +- **Configuration Complexity**: Must configure 4 separate loggers (console, file, csv, json), each with its own setup method. +- **Propagation Trap**: Must remember `propagate=False` on every logger; otherwise each record also reaches any handler attached to the parent `PopupSim` logger or the root logger and is emitted twice. Easy to forget, hard to debug. +- **Still Forced Paradigm**: CSV/JSON "loggers" don't actually log; they write files directly. Using logging.Logger for non-logging is misleading. +- **Testing Overhead**: Must mock 4 loggers instead of 1, increasing test complexity. +- **Maintenance Burden**: 4x configuration code, 4x potential bugs, 4x documentation needed. + +**Example Problem:** +```python +# Forgot propagate=False +console_logger = logging.getLogger('PopupSim.console') +# Result: records are handled by console_logger AND propagate to handlers on +# 'PopupSim' and the root logger (duplicate output if either has a handler) + +# CSV "logger" that doesn't log +csv_logger = logging.getLogger('PopupSim.csv') # Misleading name +self.csv_writer.writerow(data) # Direct file write, not logging +``` + +
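The propagation trap can be reproduced in a few lines. This is a sketch under stated assumptions: the logger names `PopupSimPropDemo` and `PopupSimPropDemo.console` are hypothetical, and in-memory streams replace stdout so the effect is easy to inspect.

```python
import io
import logging

parent_stream = io.StringIO()
child_stream = io.StringIO()

# Parent logger with its own handler (plays the role of 'PopupSim' or root).
parent = logging.getLogger("PopupSimPropDemo")
parent.handlers.clear()
parent.propagate = False  # keep the demo isolated from the real root logger
parent.setLevel(logging.INFO)
parent.addHandler(logging.StreamHandler(parent_stream))

# Child logger with its own handler (plays the role of 'PopupSim.console').
child = logging.getLogger("PopupSimPropDemo.console")
child.handlers.clear()
child.propagate = True  # the default: records bubble up to the parent
child.setLevel(logging.INFO)
child.addHandler(logging.StreamHandler(child_stream))

child.info("first")      # handled by the child AND by the parent's handler
child.propagate = False
child.info("second")     # handled by the child only

parent_lines = parent_stream.getvalue().splitlines()
child_lines = child_stream.getvalue().splitlines()
```

The first message ends up in both streams, the second only in the child's stream. If both handlers wrote to stdout, the first message would appear twice on the console.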
+ +## 7. Implementation + +**Note:** This implementation is for **Option 3 (Hybrid Approach)** - the recommended solution. + +### Core Components + +```python +# 1. Event Model +@dataclass +class SimulationEvent: + timestamp: float + event_type: str + entity_id: str + location: str + status: str + +# 2. Console Logger +def setup_console_logger(verbose: bool = False) -> logging.Logger: + logger = logging.getLogger('PopupSim') + # Configure based on verbose flag + return logger + +# 3. Event Collector +class EventCollector: + def __init__(self) -> None: + # In-memory event store; must exist before collect() is called + self.events: list[SimulationEvent] = [] + + def collect(self, event: SimulationEvent) -> None: + self.events.append(event) + + def export(self, exporter: DataExporter, path: Path) -> None: + exporter.export(self.events, path) + +# 4. Usage (method of SimulationEngine) +def _emit_event(self, event: SimulationEvent) -> None: + if self.console_logger: + self.console_logger.info(f"{event.event_type}: {event.entity_id}") + self.event_collector.collect(event) +``` + +### Simulation Prototype + +```python +# main.py +def main(verbose: bool = False, scenario: str = "default") -> None: + engine = SimulationEngine(verbose=verbose) + engine.run() + +# simulation/engine.py +class SimulationEngine: + def __init__(self, verbose: bool = False) -> None: + self.console_logger = setup_console_logger(verbose=verbose) + self.event_collector = EventCollector() + + def run(self) -> None: + # Start simulation + self.console_logger.info("Starting simulation...") + + # Train arrives + event = SimulationEvent( + timestamp=10.5, + event_type="train_arrival", + entity_id="TRAIN-001", + location="WORKSHOP-A", + status="arrived" + ) + self._emit_event(event) + + # Wagon conversion + event = SimulationEvent( + timestamp=25.0, + event_type="wagon_conversion", + entity_id="WAGON-042", + location="TRACK-3", + status="started" + ) + self._emit_event(event) + + # End simulation + self.console_logger.info("Simulation complete") + + # Export data + self.event_collector.export(CSVExporter(), Path("output/events.csv")) + 
self.event_collector.export(JSONExporter(), Path("output/events.json")) +``` + +**CLI Usage:** +```bash +# Non-verbose mode +$ python main.py --scenario default +INFO - Starting simulation... +INFO - train_arrival: TRAIN-001 +INFO - wagon_conversion: WAGON-042 +INFO - Simulation complete + +# Verbose mode +$ python main.py --scenario default --verbose +2024-01-15 14:30:45 - INFO - Starting simulation... +2024-01-15 14:30:46 - INFO - train_arrival: TRAIN-001 +2024-01-15 14:31:12 - INFO - wagon_conversion: WAGON-042 +2024-01-15 14:35:12 - INFO - Simulation complete +``` + +**Data Export (same for both modes):** +```csv +# events.csv +timestamp,event_type,entity_id,location,status +10.5,train_arrival,TRAIN-001,WORKSHOP-A,arrived +25.0,wagon_conversion,WAGON-042,TRACK-3,started +``` + +```json +# events.json +[{"timestamp": 10.5, "event_type": "train_arrival", ...}, ...] +``` + +### File Structure + +``` +core/logging/ +├── events.py # SimulationEvent dataclass +├── console.py # setup_console_logger() +├── collector.py # EventCollector class +└── exporters.py # CSVExporter, JSONExporter +``` + +## 8. Migration Path + +**Note:** This migration path is for **Option 3 (Hybrid Approach)** - the recommended solution. + +1. Create `core/logging/` structure +2. Implement event model and systems +3. Integrate with main.py (verbose/debug flags) +4. Replace print() with logger.info() and _emit_event() +5. Validate with test scenarios + + +## 9. 
When to Revisit + +This decision should be revisited if: + +- **Performance Issues**: Dual event processing (logger + collector) causes measurable performance degradation +- **New Requirements**: Need for additional output formats (database, message queue, real-time streaming) +- **Team Feedback**: After 3 months of usage, team reports significant maintenance burden +- **Scale Changes**: Simulation size exceeds 100K events, causing memory issues with in-memory collection +- **Technology Changes**: New Python logging features or libraries that better address the requirements + +**Review Schedule:** 3 months after implementation diff --git a/docs/adr/backup_adr_skeches/ADR-001-option1-single-logger-multiple-handlers.md b/docs/adr/backup_adr_skeches/ADR-001-option1-single-logger-multiple-handlers.md new file mode 100644 index 00000000..3c028148 --- /dev/null +++ b/docs/adr/backup_adr_skeches/ADR-001-option1-single-logger-multiple-handlers.md @@ -0,0 +1,540 @@ +# ADR-001 Option 1: Single Logger with Multiple Handlers + +## Metadata +- **Status**: Considered +- **Date**: 2024-01-15 +- **Decision Makers**: Backend Development Team +- **Related Options**: [Option 2](ADR-001-option2-multiple-loggers.md), [Option 3](ADR-001-option3-hybrid-approach.md) + +## Context and Problem Statement + +### Current State +PopUp-Sim simulates freight rail DAC migration scenarios. Currently, the simulation runs without providing runtime visibility to users. + +### Requirements +Implement a dual-output logging system that provides: + +1. **Console feedback** during simulation runs + - Track simulation progress + - Display events as they occur + - Controlled by `--verbose` flag from main.py + +2. **Structured data export** (CSV/JSON) for post-simulation dashboard visualization + - Complete event history + - Machine-readable format + - Independent of console verbosity + +**Key Constraint**: Both outputs must be independently configurable to allow flexible user customization. 
+ +### Decision Question +Should we use a **single logger with multiple handlers** to distribute events to different outputs? + +## Architecture Design + +### System Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Simulation Code │ +│ │ +│ logger.info(event) │ +│ │ │ +│ ▼ │ +│ ┌──────────────────────┐ │ +│ │ SINGLE LOGGER │ │ +│ │ logging.Logger │ │ +│ │ (name='PopupSim') │ │ +│ └──────────┬───────────┘ │ +│ │ │ +│ ┌───────────────┼───────────────┬──────────┐ │ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ ┌────────────┐ ┌────────────┐ ┌──────────┐ ┌─────────┐ │ +│ │ Console │ │ File │ │ CSV │ │ JSON │ │ +│ │ Handler │ │ Handler │ │ Handler │ │ Handler │ │ +│ └─────┬──────┘ └─────┬──────┘ └────┬─────┘ └────┬────┘ │ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ stdout app.log events.csv events.json │ +└─────────────────────────────────────────────────────────────┘ + +Key: ONE logger distributes to MULTIPLE handlers +``` + +### Key Principle +**ONE logger instance** with **MULTIPLE handlers** attached. All handlers receive the same LogRecord and process it independently. The logger acts as a central distribution point. + +## Implementation + +### 1. Event Model (`core/logging/events.py`) + +```python +"""Simulation event models.""" + +from dataclasses import dataclass, field, asdict +from typing import Any +import json + + +@dataclass +class SimulationEvent: + """Simulation event.""" + + timestamp: float + event_type: str + entity_id: str + location: str + status: str + duration: float = 0.0 + metadata: dict[str, Any] = field(default_factory=dict) + + def to_log_message(self) -> str: + """Convert to human-readable log message.""" + return f"[{self.timestamp:.2f}] {self.event_type}: {self.entity_id} at {self.location}" + + def to_dict(self) -> dict[str, Any]: + """Convert to dictionary for structured export.""" + return asdict(self) +``` + +### 2. 
Custom Handlers (`core/logging/handlers.py`) + +```python +"""Custom logging handlers for structured data export.""" + +import csv +import json +import logging +from pathlib import Path +from typing import Any + + +class CSVHandler(logging.Handler): + """Handler that writes events to CSV file.""" + + def __init__(self, filepath: Path) -> None: + """Initialize CSV handler.""" + super().__init__() + self.filepath = filepath + self.file = open(filepath, 'w', newline='', encoding='utf-8') + # csv.DictWriter takes a single type parameter (the field-name type) + self.writer: csv.DictWriter[str] | None = None + self.headers_written = False + + def emit(self, record: logging.LogRecord) -> None: + """Write log record to CSV.""" + try: + if hasattr(record, 'event_data'): + event_dict = record.event_data + + if not self.headers_written: + # Header row is derived from the first event's keys + self.writer = csv.DictWriter(self.file, fieldnames=list(event_dict.keys())) + self.writer.writeheader() + self.headers_written = True + + if self.writer: + self.writer.writerow(event_dict) + self.file.flush() + except Exception: + self.handleError(record) + + def close(self) -> None: + """Close file handler.""" + self.file.close() + super().close() + + +class JSONHandler(logging.Handler): + """Handler that writes events to JSON file.""" + + def __init__(self, filepath: Path) -> None: + """Initialize JSON handler.""" + super().__init__() + self.filepath = filepath + self.events: list[dict[str, Any]] = [] + + def emit(self, record: logging.LogRecord) -> None: + """Collect log record for JSON export.""" + try: + if hasattr(record, 'event_data'): + self.events.append(record.event_data) + except Exception: + self.handleError(record) + + def close(self) -> None: + """Write all events to JSON file.""" + with open(self.filepath, 'w', encoding='utf-8') as f: + json.dump(self.events, f, indent=2) + super().close() +``` + +### 3. 
Logger Setup (`core/logging/setup.py`) + +```python +"""Logging system setup.""" + +import logging +import sys +from pathlib import Path + +from .handlers import CSVHandler, JSONHandler +from .events import SimulationEvent + + +def setup_logger( + name: str = 'PopupSim', + console_level: str = 'INFO', + verbose: bool = False, + enable_csv: bool = True, + enable_json: bool = True, + output_dir: Path = Path('output') +) -> logging.Logger: + """Configure single logger with multiple handlers.""" + logger = logging.getLogger(name) + logger.setLevel(logging.DEBUG) + logger.handlers.clear() + + # Console handler with verbose formatting + console_handler = logging.StreamHandler(sys.stdout) + console_handler.setLevel(getattr(logging, console_level.upper())) + + if verbose: + console_formatter = logging.Formatter( + '%(asctime)s - %(name)s - %(levelname)s - %(message)s', + datefmt='%Y-%m-%d %H:%M:%S' + ) + else: + console_formatter = logging.Formatter('%(levelname)s - %(message)s') + + console_handler.setFormatter(console_formatter) + logger.addHandler(console_handler) + + # File handler + output_dir.mkdir(parents=True, exist_ok=True) + file_handler = logging.FileHandler(output_dir / 'simulation.log') + file_handler.setLevel(logging.DEBUG) + file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s') + file_handler.setFormatter(file_formatter) + logger.addHandler(file_handler) + + # CSV handler + if enable_csv: + csv_handler = CSVHandler(output_dir / 'events.csv') + csv_handler.setLevel(logging.DEBUG) + logger.addHandler(csv_handler) + + # JSON handler + if enable_json: + json_handler = JSONHandler(output_dir / 'events.json') + json_handler.setLevel(logging.DEBUG) + logger.addHandler(json_handler) + + return logger + + +def log_event(logger: logging.Logger, event: SimulationEvent) -> None: + """Log simulation event to all handlers.""" + extra = {'event_data': event.to_dict()} + logger.info(event.to_log_message(), extra=extra) +``` + +## Design Decisions + 
+### Core Architecture Decision +**Use ONE logger instance** (`logging.Logger`) with **MULTIPLE handlers** attached. Each handler processes the same LogRecord independently. + +### Handler Responsibilities +1. **Console handler** (`logging.StreamHandler`): User-facing output to stdout +2. **File handler** (`logging.FileHandler`): Developer debug logs to file +3. **CSVHandler**: Structured data export to CSV +4. **JSONHandler**: Structured data export to JSON + +### Data Flow +``` +Simulation Event → logger.info(msg, extra={'event_data': dict}) + ↓ + LogRecord created + ↓ + ┌─────────────┼─────────────┬─────────────┐ + ↓ ↓ ↓ ↓ + Console File CSV JSON + Handler Handler Handler Handler +``` + +### Key Design Choices + +#### 1. Extra Field for Structured Data +**Decision**: Use `extra={'event_data': dict}` to attach structured data to LogRecord. + +**Rationale**: +- Standard Python logging pattern +- Allows handlers to access both message and structured data +- No modification to logging module needed + +**Implementation**: +```python +event_dict = event.to_dict() +logger.info(event.to_log_message(), extra={'event_data': event_dict}) +``` + +#### 2. Custom Handlers for Data Export +**Decision**: Implement custom CSVHandler and JSONHandler classes. + +**Rationale**: +- Built-in handlers only support text output +- Need structured data export (CSV/JSON) +- Must implement logging.Handler interface + +**Trade-off**: Custom code to maintain, but necessary for requirements. + +#### 3. Single Logger Level +**Decision**: Set logger level to DEBUG, control output via handler levels. + +**Rationale**: +- Logger level acts as global minimum +- Each handler can filter independently +- Console can be INFO while file is DEBUG + +**Limitation**: The logger has a single level shared by all handlers; records below it are dropped before any handler sees them, so the logger must stay at the most permissive level any handler needs. + +## Detailed Argumentation + +### Advantages + +#### 1. 
Standard Python Approach ✅ +- Uses `logging` module as designed by Python core team +- Well-documented in Python docs +- Familiar to all Python developers +- No external dependencies +- Battle-tested in production systems + +**Example**: Django, Flask, and most Python frameworks use this pattern. + +#### 2. Unified Configuration ✅ +- Single logger instance to configure +- One place to set log levels +- Centralized error handling +- Simple initialization code + +**Code Impact**: +```python +# Simple setup +logger = setup_logger(name='PopupSim', verbose=True) + +# Use everywhere +logger.info("message") +``` + +#### 3. Built-in Features ✅ +- **Thread-safety**: Automatic locking for concurrent access +- **Exception handling**: Built-in error recovery +- **Log level filtering**: Per-handler level control +- **Formatters**: Flexible message formatting +- **Rotation**: File rotation via RotatingFileHandler + +#### 4. Simple Testing ✅ +- Mock single logger instance +- Standard pytest patterns apply +- Use `caplog` fixture for assertions + +**Test Example**: +```python +def test_logging(caplog): + with caplog.at_level(logging.INFO): + logger.info("test") + assert "test" in caplog.text +``` + +### Disadvantages + +#### 1. Tight Coupling ❌ +**Problem**: All handlers share the same logger configuration. + +**Impact**: +- Cannot have completely independent configurations +- Logger level affects all handlers +- Changing logger affects all outputs + +**Example**: +```python +# If logger level is WARNING, INFO messages are lost +logger.setLevel(logging.WARNING) +logger.info("This won't reach ANY handler") +``` + +#### 2. Forced Paradigm ❌ +**Problem**: Data export forced into logging framework. 
+ +**Impact**: +- CSV/JSON export conceptually not "logging" +- Must use LogRecord for data transport +- Awkward fit for pure data collection + +**Example**: +```python +# Using logging for data export feels forced +logger.info("message", extra={'event_data': {...}}) # Awkward +``` + +#### 3. Limited Flexibility ❌ +**Problem**: Hard to add non-logging outputs. + +**Impact**: +- New output types must implement Handler interface +- Cannot easily integrate non-logging systems +- Locked into logging architecture + +**Example**: Adding database export requires custom Handler, not natural fit. + + + +#### 4. Configuration Complexity ❌ +**Problem**: Handler interactions can be confusing. + +**Impact**: +- Must understand handler vs logger levels +- Propagation settings can cause issues +- Formatter conflicts possible + +**Common Bug**: +```python +# Logger level too high - handlers never receive events +logger.setLevel(logging.ERROR) # Blocks INFO/DEBUG +console_handler.setLevel(logging.DEBUG) # Never receives DEBUG +``` + +#### 5. Verbose Flag Limitation ❌ +**Problem**: Verbose affects formatter, but all handlers see same logger. + +**Impact**: +- Cannot have verbose console + compact file log independently +- Verbose mode affects entire logging system +- Must choose one format for all + +**Workaround**: Set formatter per handler, but verbose flag still global. + +## Verbose Flag Handling + +### Integration with main.py + +The `--verbose` flag from main.py controls console output detail level: + +```python +# main.py +def main(verbose: bool = False, debug: str = 'INFO') -> None: + logger = setup_logger( + name='PopupSim', + console_level=debug, + verbose=verbose, + enable_csv=True, + enable_json=True, + output_dir=Path('output') + ) +``` + +### Output Examples + +#### Non-Verbose Mode (`--verbose` not set) +``` +INFO - Starting simulation... 
+INFO - Train TRAIN-001 arrived at STATION-A +INFO - Wagon WAGON-042 conversion started +INFO - Simulation complete +``` + +#### Verbose Mode (`--verbose` flag set) +``` +2024-01-15 14:30:45 - PopupSim - INFO - Starting simulation... +2024-01-15 14:30:46 - PopupSim - INFO - Train TRAIN-001 arrived at STATION-A +2024-01-15 14:31:12 - PopupSim - INFO - Wagon WAGON-042 conversion started +2024-01-15 14:35:12 - PopupSim - INFO - Simulation complete +``` + +### Implementation Detail + +```python +def setup_logger(verbose: bool = False, ...) -> logging.Logger: + # Console handler formatting based on verbose flag + if verbose: + console_formatter = logging.Formatter( + '%(asctime)s - %(name)s - %(levelname)s - %(message)s', + datefmt='%Y-%m-%d %H:%M:%S' + ) + else: + console_formatter = logging.Formatter('%(levelname)s - %(message)s') + + console_handler.setFormatter(console_formatter) +``` + +### Limitation Analysis + +**Problem**: Verbose flag affects console formatter, but all handlers share same logger. + +**Impact**: +- Cannot have verbose console + compact file log independently +- File handler also affected if using same formatter +- Must set different formatters per handler + +**Workaround**: +```python +# Console: verbose-aware +console_handler.setFormatter(console_formatter) # Based on verbose flag + +# File: always detailed +file_formatter = logging.Formatter( + '%(asctime)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s' +) +file_handler.setFormatter(file_formatter) # Independent format +``` + +**Conclusion**: Workaround exists but adds complexity. Each handler needs explicit formatter configuration. + +## Migration Path + +1. **Setup**: Create `core/logging/` with events.py, handlers.py, setup.py +2. **Integration**: Connect to main.py (verbose/debug flags) and SimulationEngine +3. **Migration**: Replace print() with logger.info() and log_event() calls +4. 
**Validation**: Verify all outputs work correctly with real scenarios + +## Implementation Checklist + +- [ ] Create `core/logging/` directory structure +- [ ] Implement `events.py` with SimulationEvent dataclass +- [ ] Implement `handlers.py` with CSVHandler and JSONHandler +- [ ] Implement `setup.py` with setup_logger() and log_event() +- [ ] Integrate with main.py CLI (verbose, debug flags) +- [ ] Update SimulationEngine to use logger +- [ ] Replace print() with logger.info() and log_event() +- [ ] Verify with: `uv run ruff format . && uv run mypy backend/src/ && uv run pytest` + +## Conclusion + +### Summary +This approach uses **Python's standard logging module with multiple handlers** attached to a single logger instance. It's the most conventional Python approach. + +### Strengths +- ✅ Standard Python pattern +- ✅ Well-documented and understood +- ✅ Built-in features (thread-safety, error handling) +- ✅ Simple testing with pytest + +### Weaknesses +- ❌ Tight coupling between handlers +- ❌ Data export forced into logging paradigm +- ❌ Limited flexibility for non-logging outputs +- ❌ Verbose flag affects all handlers +- ❌ Performance overhead for all handlers + +### Recommendation +This approach works well for **traditional logging use cases** but feels **forced for structured data export**. The tight coupling between console output and data collection makes independent configuration difficult. 
+ +**Use this option if**: +- Team is very familiar with Python logging +- Standard approach is highly valued +- Data export requirements are simple +- Tight coupling is acceptable + +**Avoid this option if**: +- Need complete independence between outputs +- Data export is primary concern +- Flexibility for future extensions is important +- Performance is critical diff --git a/docs/adr/backup_adr_skeches/ADR-001-option2-multiple-loggers.md b/docs/adr/backup_adr_skeches/ADR-001-option2-multiple-loggers.md new file mode 100644 index 00000000..cd719a8e --- /dev/null +++ b/docs/adr/backup_adr_skeches/ADR-001-option2-multiple-loggers.md @@ -0,0 +1,524 @@ +# ADR-001 Option 2: Multiple Independent Loggers + +## Metadata +- **Status**: Considered +- **Date**: 2024-01-15 +- **Decision Makers**: Backend Development Team +- **Related Options**: [Option 1](ADR-001-option1-single-logger-multiple-handlers.md), [Option 3](ADR-001-option3-hybrid-approach.md) + +## Context and Problem Statement + +### Current State +PopUp-Sim simulates freight rail DAC migration scenarios. Currently, the simulation runs without providing runtime visibility to users. + +### Requirements +Implement a dual-output logging system that provides: + +1. **Console feedback** during simulation runs + - Track simulation progress + - Display events as they occur + - Controlled by `--verbose` flag from main.py + +2. **Structured data export** (CSV/JSON) for post-simulation dashboard visualization + - Complete event history + - Machine-readable format + - Independent of console verbosity + +**Key Constraint**: Both outputs must be independently configurable to allow flexible user customization. + +### Decision Question +Should we use **multiple independent logger instances**, each dedicated to a specific output purpose? 
+ +## Architecture Design + +### System Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Simulation Code │ +│ │ +│ emit_event(event) │ +│ │ │ +│ ┌───────────────┼───────────────┬──────────────┐ │ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ ┌──────────┐│ +│ │ LOGGER │ │ LOGGER │ │ LOGGER │ │ LOGGER ││ +│ │ 'console' │ │ 'file' │ │ 'csv' │ │ 'json' ││ +│ │ │ │ │ │ │ │ ││ +│ │ propagate= │ │ propagate= │ │propagate=│ │propagate=││ +│ │ False │ │ False │ │ False │ │ False ││ +│ └──────┬──────┘ └──────┬──────┘ └────┬─────┘ └────┬─────┘│ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ Console │ │ File │ │ CSV │ │ JSON │ │ +│ │ Handler │ │ Handler │ │ Writer │ │ Collector│ │ +│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ stdout app.log events.csv events.json │ +└─────────────────────────────────────────────────────────────┘ + +Key: FOUR independent loggers, each with its own handler +``` + +### Key Principle +**MULTIPLE independent logger instances** (PopupSim.console, PopupSim.file, PopupSim.csv, PopupSim.json). Each logger is completely separate with its own configuration. Code must explicitly call each logger. + +## Implementation + +### 1. 
Event Model (`core/logging/events.py`) + +```python +"""Simulation event models.""" + +from dataclasses import dataclass, field, asdict +from typing import Any + + +@dataclass +class SimulationEvent: + """Simulation event.""" + + timestamp: float + event_type: str + entity_id: str + location: str + status: str + duration: float = 0.0 + metadata: dict[str, Any] = field(default_factory=dict) + + def to_console_message(self) -> str: + """Format for console output.""" + return f"[{self.timestamp:.2f}] {self.event_type}: {self.entity_id} at {self.location}" + + def to_csv_row(self) -> list[Any]: + """Format as CSV row.""" + return [self.timestamp, self.event_type, self.entity_id, + self.location, self.status, self.duration] + + def to_dict(self) -> dict[str, Any]: + """Convert to dictionary.""" + return asdict(self) +``` + +### 2. Logger Manager (`core/logging/manager.py`) + +```python +"""Multi-logger management system.""" + +import csv +import json +import logging +import sys +from pathlib import Path +from typing import Any + +from .events import SimulationEvent + + +class LoggerManager: + """Manages multiple independent loggers.""" + + def __init__( + self, + output_dir: Path = Path('output'), + console_enabled: bool = True, + verbose: bool = False, + csv_enabled: bool = True, + json_enabled: bool = True + ) -> None: + """Initialize logger manager.""" + self.output_dir = output_dir + self.output_dir.mkdir(parents=True, exist_ok=True) + self.verbose = verbose + + self.console_logger = self._setup_console_logger() if console_enabled else None + self.file_logger = self._setup_file_logger() + + # CSV setup + self.csv_enabled = csv_enabled + if csv_enabled: + self.csv_file = open(output_dir / 'events.csv', 'w', newline='', encoding='utf-8') + self.csv_writer = csv.writer(self.csv_file) + self.csv_writer.writerow(['timestamp', 'event_type', 'entity_id', + 'location', 'status', 'duration']) + + # JSON setup + self.json_enabled = json_enabled + self.json_events: 
list[dict[str, Any]] = [] + + def _setup_console_logger(self) -> logging.Logger: + """Setup console logger.""" + logger = logging.getLogger('PopupSim.console') + logger.setLevel(logging.INFO) + logger.handlers.clear() + logger.propagate = False + + handler = logging.StreamHandler(sys.stdout) + + if self.verbose: + formatter = logging.Formatter( + '%(asctime)s - %(name)s - %(levelname)s - %(message)s', + datefmt='%Y-%m-%d %H:%M:%S' + ) + else: + formatter = logging.Formatter('%(levelname)s - %(message)s') + + handler.setFormatter(formatter) + logger.addHandler(handler) + + return logger + + def _setup_file_logger(self) -> logging.Logger: + """Setup file logger.""" + logger = logging.getLogger('PopupSim.file') + logger.setLevel(logging.DEBUG) + logger.handlers.clear() + logger.propagate = False + + handler = logging.FileHandler(self.output_dir / 'simulation.log') + formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s') + handler.setFormatter(formatter) + logger.addHandler(handler) + + return logger + + def log_event(self, event: SimulationEvent) -> None: + """Log event to all enabled loggers.""" + if self.console_logger: + self.console_logger.info(event.to_console_message()) + + if self.file_logger: + self.file_logger.debug(f"{event.event_type} | {event.entity_id}") + + if self.csv_enabled: + self.csv_writer.writerow(event.to_csv_row()) + self.csv_file.flush() + + if self.json_enabled: + self.json_events.append(event.to_dict()) + + def close(self) -> None: + """Close all loggers and flush data.""" + if self.csv_enabled: + self.csv_file.close() + + if self.json_enabled and self.json_events: + json_path = self.output_dir / 'events.json' + with open(json_path, 'w', encoding='utf-8') as f: + json.dump(self.json_events, f, indent=2) +``` + +## Design Decisions + +### Core Architecture Decision +**Use FOUR independent logger instances**, each with its own configuration, handler, and purpose: +1. `PopupSim.console` - Console output +2. 
`PopupSim.file` - Debug file logs +3. `PopupSim.csv` - CSV data export (conceptual) +4. `PopupSim.json` - JSON data export (conceptual) + +### Logger Responsibilities +Each logger is completely independent: +- Own namespace in logging hierarchy +- Own handler and formatter +- Own log level +- Own enable/disable flag +- `propagate = False` to prevent parent logger interference + +### Data Flow +``` +Simulation Event → emit_event(event) + ↓ + ┌─────────────┼─────────────┬─────────────┐ + ↓ ↓ ↓ ↓ + console_logger file_logger csv_logger json_logger + .info(msg) .debug(msg) (direct CSV) (collect) + ↓ ↓ ↓ ↓ + stdout simulation.log events.csv events.json +``` + +### Key Design Choices + +#### 1. Logger Namespace Hierarchy +**Decision**: Use hierarchical names (PopupSim.console, PopupSim.file, etc.) + +**Rationale**: +- Clear organization in logging system +- Easy to identify logger purpose +- Follows Python logging conventions +- Prevents naming conflicts + +**Implementation**: +```python +console_logger = logging.getLogger('PopupSim.console') +file_logger = logging.getLogger('PopupSim.file') +csv_logger = logging.getLogger('PopupSim.csv') +json_logger = logging.getLogger('PopupSim.json') +``` + +#### 2. Propagation Disabled +**Decision**: Set `propagate = False` on all loggers. + +**Rationale**: +- Prevents events from bubbling to parent loggers +- Avoids duplicate log messages +- Ensures complete independence + +**Critical**: Without this, records are also passed to the handlers of ancestor loggers (`PopupSim` and the root logger) and appear more than once whenever one of those has a handler attached. + +```python +logger.propagate = False # MUST set this +``` + +#### 3. Direct File Writing for CSV/JSON +**Decision**: CSV/JSON loggers don't actually "log" - they write directly to files. + +**Rationale**: +- Logging paradigm doesn't fit data export well +- Direct file writing is more natural +- Better performance for structured data +- Clearer code intent + +**Trade-off**: Using "logger" name for non-logging is conceptually awkward. + +#### 4. 
Manager Class Coordination +**Decision**: LoggerManager class coordinates all loggers. + +**Rationale**: +- Single point of configuration +- Manages logger lifecycle +- Handles file closing and flushing +- Simplifies usage in simulation code + +## Detailed Argumentation + +### Advantages + +#### 1. Complete Independence ✅ +**Benefit**: Each logger has its own configuration with zero interference. + +**Impact**: +- Console logger can be INFO level +- File logger can be DEBUG level +- CSV/JSON can be always-on +- No shared state between loggers + +**Example**: +```python +console_logger.setLevel(logging.INFO) # User-facing +file_logger.setLevel(logging.DEBUG) # Developer-facing +# No conflict! +``` + +#### 2. Clear Separation ✅ +**Benefit**: Each logger has a single, clear responsibility. + +**Impact**: +- Easy to understand code +- Clear which logger does what +- No confusion about handler purpose +- Follows Single Responsibility Principle + +**Code Clarity**: +```python +self.console_logger.info("User message") # Clear: goes to console +self.file_logger.debug("Debug info") # Clear: goes to file +``` + +#### 3. Flexible Configuration ✅ +**Benefit**: Each logger can have completely different settings. + +**Impact**: +- Different log levels per logger +- Different formatters per logger +- Different handlers per logger +- Independent enable/disable + +**Example**: +```python +# Console: simple format, INFO level +console_logger.setLevel(logging.INFO) +console_formatter = logging.Formatter('%(message)s') + +# File: detailed format, DEBUG level +file_logger.setLevel(logging.DEBUG) +file_formatter = logging.Formatter('%(asctime)s - %(funcName)s:%(lineno)d - %(message)s') +``` + +#### 4. Namespace Isolation ✅ +**Benefit**: No naming conflicts in logging hierarchy. + +**Impact**: +- Clear logger identification +- Easy to filter logs by logger name +- No accidental logger reuse +- Follows Python logging best practices + +**Hierarchy**: +``` +PopupSim (parent logger, below the Python root logger) + ├── console + ├── file + ├── csv + └── json +``` + +### Disadvantages + +#### 1. 
Still Coupled to Logging ❌ +**Problem**: CSV/JSON export forced into logging paradigm even though it's not really logging. + +**Impact**: +- Conceptual mismatch: data export ≠ logging +- Using logging.Logger for non-logging tasks +- Awkward code that doesn't express intent +- Misleading for future developers + +**Example**: +```python +# This "logger" doesn't actually log - it writes CSV +self.csv_logger = logging.getLogger('PopupSim.csv') # Misleading name +``` + +#### 2. Configuration Complexity ❌ +**Problem**: Must configure four separate loggers. + +**Impact**: +- More setup code required +- More parameters to manage +- More potential for configuration errors +- Harder to maintain consistency + +**Code Volume**: +```python +# Must setup each logger individually +self._setup_console_logger() +self._setup_file_logger() +self._setup_csv_logger() +self._setup_json_logger() +# 4x the configuration code +``` + +#### 3. Propagation Issues ❌ +**Problem**: Must remember to set `propagate = False` on every logger. + +**Impact**: +- Easy to forget +- Causes duplicate messages if forgotten +- Hard to debug when it happens +- Not obvious to new developers + +**Common Bug**: +```python +logger = logging.getLogger('PopupSim.console') +# Forgot: logger.propagate = False +# Result: Messages appear twice (console + root logger) +``` + +#### 4. Awkward Data Export ❌ +**Problem**: CSV/JSON "loggers" don't actually log. + +**Impact**: +- Misleading code structure +- Logger used for file writing +- Doesn't leverage logging features +- Better done with custom classes + +**Reality**: +```python +# This isn't logging, it's just file writing +if self.csv_enabled: + self.csv_writer.writerow(event.to_csv_row()) # Direct file write + # Why use a "logger" for this? +``` + + + +#### 5. Verbose Flag Complexity ❌ +**Problem**: Must pass verbose flag to each logger setup individually. 
+ +**Impact**: +- More parameters to pass around +- More configuration code +- Easy to forget for one logger +- Inconsistent behavior if forgotten + +**Configuration**: +```python +def __init__(self, verbose: bool = False): + self.verbose = verbose + self.console_logger = self._setup_console_logger() # Uses self.verbose + self.file_logger = self._setup_file_logger() # Might also use it + # Must remember to use verbose in each setup method +``` + +## Verbose Flag Handling + +The `--verbose` flag from main.py could be handled independently per logger if each setup method accepted its own `verbose` argument. The `LoggerManager` shown above instead reads the shared `self.verbose`, and only the console logger uses it: + +```python +# Console logger can be verbose +self.console_logger = self._setup_console_logger(verbose=True) + +# File logger can have different format +self.file_logger = self._setup_file_logger(verbose=False) +``` + +**Advantage**: Each logger can have independent verbose settings. Console can be verbose while file logs remain compact. + +**Disadvantage**: Must pass verbose flag to each logger setup method, increasing configuration complexity. + +## Migration Path + +1. **Setup**: Create `core/logging/` with events.py and manager.py (LoggerManager class) +2. **Integration**: Connect to main.py (verbose/debug flags) and SimulationEngine +3. **Migration**: Replace print() with manager.log_event() calls +4. **Validation**: Verify logger independence and propagate=False settings + +## Implementation Checklist + +- [ ] Create `core/logging/` directory structure +- [ ] Implement `events.py` with SimulationEvent dataclass +- [ ] Implement `manager.py` with LoggerManager class +- [ ] Implement _setup_console_logger() and _setup_file_logger() with propagate=False +- [ ] Implement CSV/JSON writing logic in LoggerManager +- [ ] Integrate with main.py CLI (verbose, debug flags) +- [ ] Update SimulationEngine to use LoggerManager +- [ ] Replace print() with manager.log_event() +- [ ] Verify with: `uv run ruff format . 
&& uv run mypy backend/src/ && uv run pytest` + +## Conclusion + +### Summary +This approach uses **multiple independent logger instances**, each dedicated to a specific output. It provides better independence than Option 1 but still forces data export into the logging paradigm. + +### Strengths +- ✅ Complete independence between loggers +- ✅ Clear separation of concerns +- ✅ Flexible configuration per logger +- ✅ Namespace isolation +- ✅ Verbose flag can be per-logger + +### Weaknesses +- ❌ Still couples data export to logging +- ❌ Configuration complexity (4 loggers) +- ❌ Must remember propagate=False +- ❌ CSV/JSON "loggers" don't actually log +- ❌ Testing complexity (mock 4 loggers) +- ❌ More code to maintain + +### Recommendation +This approach provides **better independence** than Option 1 but **doesn't solve the fundamental problem**: data export is not logging. The multiple loggers add complexity without addressing the conceptual mismatch. + +**Use this option if**: +- Complete logger independence is critical +- Team comfortable with multiple logger management +- Willing to accept logging paradigm for data export +- Configuration complexity is acceptable + +**Avoid this option if**: +- Data export is primary concern +- Want conceptually clean separation +- Prefer simpler configuration +- Team unfamiliar with propagate settings diff --git a/docs/adr/backup_adr_skeches/ADR-001-option3-hybrid-approach.md b/docs/adr/backup_adr_skeches/ADR-001-option3-hybrid-approach.md new file mode 100644 index 00000000..09aec652 --- /dev/null +++ b/docs/adr/backup_adr_skeches/ADR-001-option3-hybrid-approach.md @@ -0,0 +1,671 @@ +# ADR-001 Option 3: Hybrid Approach (Logging + Event Collection) + +## Metadata +- **Status**: Recommended +- **Date**: 2024-01-15 +- **Decision Makers**: Backend Development Team +- **Related Options**: [Option 1](ADR-001-option1-single-logger-multiple-handlers.md), [Option 2](ADR-001-option2-multiple-loggers.md) + +## Context and Problem Statement + 
+### Current State +PopUp-Sim simulates freight rail DAC migration scenarios. Currently, the simulation runs without providing runtime visibility to users. + +### Requirements +Implement a dual-output logging system that provides: + +1. **Console feedback** during simulation runs + - Track simulation progress + - Display events as they occur + - Controlled by `--verbose` flag from main.py + +2. **Structured data export** (CSV/JSON) for post-simulation dashboard visualization + - Complete event history + - Machine-readable format + - Independent of console verbosity + +**Key Constraint**: Both outputs must be independently configurable to allow flexible user customization. + +### Decision Question +Should we **separate logging from data export** by using Python logging for human-readable output and a custom event collection system for structured data? + +## Architecture Design + +### System Overview + +``` +┌─────────────────────────────────────────────────────────────┐ +│ PopupSim │ +│ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ Simulation Event Emission │ │ +│ │ _emit_event() │ │ +│ └──────────────────┬──────────────────────────────────┘ │ +│ │ │ +│ ┌───────────┴───────────┐ │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌─────────────┐ ┌─────────────┐ │ +│ │ Console │ │ Event │ │ +│ │ Logger │ │ Collector │ │ +│ │ (logging) │ │ (custom) │ │ +│ └──────┬──────┘ └──────┬──────┘ │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌────────┐ ┌──────────────┐ │ +│ │ stdout │ │ Exporters │ │ +│ └────────┘ │ CSV | JSON │ │ +│ └──────┬───────┘ │ +│ ▼ │ +│ ┌──────────────┐ │ +│ │ Data Files │ │ +│ └──────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +### Key Principle +**Separation of Concerns**: Use each technology for what it's designed for: +- **Python logging**: Human-readable console/file output +- **Custom event collector**: Structured data export (CSV/JSON) + +These are fundamentally different concerns and should be separate systems. 
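The separation described in the Key Principle can be illustrated with a minimal, self-contained sketch. It is not the implementation proposed in this ADR: `Event`, `run_once`, and the logger name `sketch.console` are hypothetical names chosen for illustration, and an in-memory `io.StringIO` stands in for stdout. The sketch suggests that switching console output off leaves the collected records unchanged.

```python
import io
import logging
from dataclasses import asdict, dataclass


@dataclass
class Event:
    """Hypothetical, reduced stand-in for SimulationEvent."""

    timestamp: float
    event_type: str
    entity_id: str


def run_once(console_enabled: bool) -> tuple[str, list[dict]]:
    """Emit one event and return (console text, collected records)."""
    stream = io.StringIO()  # assumption: stands in for sys.stdout
    logger = logging.getLogger('sketch.console')
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    # A NullHandler keeps the logger silent when the console is disabled.
    if console_enabled:
        logger.addHandler(logging.StreamHandler(stream))
    else:
        logger.addHandler(logging.NullHandler())

    collected: list[Event] = []
    event = Event(0.0, 'train_arrival', 'TRAIN-001')

    # Concern 1: human-readable message through the logging module.
    logger.info('%s: %s', event.event_type, event.entity_id)
    # Concern 2: structured record kept in a plain list, no logging involved.
    collected.append(event)

    return stream.getvalue(), [asdict(e) for e in collected]
```

Calling `run_once(True)` and `run_once(False)` returns different console text but equal record lists, which is the independence the two-system design aims for.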
+ +## Implementation + +### 1. Event Model (`core/logging/events.py`) + +```python +"""Simulation event models.""" + +from dataclasses import dataclass, field, asdict +from typing import Any + + +@dataclass +class SimulationEvent: + """Simulation event.""" + + timestamp: float + event_type: str + entity_id: str + location: str + status: str + duration: float = 0.0 + metadata: dict[str, Any] = field(default_factory=dict) + + def to_dict(self) -> dict[str, Any]: + """Convert to dictionary.""" + return asdict(self) +``` + +### 2. Console Logger (`core/logging/console.py`) + +```python +"""Console logging configuration.""" + +import logging +import sys + + +def setup_console_logger( + name: str = 'PopupSim', + level: str = 'INFO', + verbose: bool = False +) -> logging.Logger: + """Configure console logger.""" + logger = logging.getLogger(name) + logger.setLevel(getattr(logging, level.upper())) + logger.handlers.clear() + + handler = logging.StreamHandler(sys.stdout) + + if verbose: + formatter = logging.Formatter( + '%(asctime)s - %(levelname)s - %(message)s', + datefmt='%Y-%m-%d %H:%M:%S' + ) + else: + formatter = logging.Formatter('%(levelname)s - %(message)s') + + handler.setFormatter(formatter) + logger.addHandler(handler) + + return logger +``` + +### 3. Event Collector (`core/logging/collector.py`) + +```python +"""Event collection system for structured data export.""" + +from pathlib import Path +from typing import Protocol + +from .events import SimulationEvent + + +class DataExporter(Protocol): + """Protocol for data exporters.""" + + def export(self, events: list[SimulationEvent], output_path: Path) -> None: + """Export events to file.""" + ... 
+ + +class EventCollector: + """Collects simulation events for structured data export.""" + + def __init__(self) -> None: + """Initialize event collector.""" + self.events: list[SimulationEvent] = [] + + def collect(self, event: SimulationEvent) -> None: + """Collect an event.""" + self.events.append(event) + + def export(self, exporter: DataExporter, output_path: Path) -> None: + """Export collected events.""" + exporter.export(self.events, output_path) + + def clear(self) -> None: + """Clear collected events.""" + self.events.clear() +``` + +### 4. Exporters (`core/logging/exporters.py`) + +```python +"""Data exporters for simulation events.""" + +import csv +import json +from pathlib import Path + +from .events import SimulationEvent + + +class CSVExporter: + """Export events to CSV format.""" + + def export(self, events: list[SimulationEvent], output_path: Path) -> None: + """Export events to CSV file.""" + output_path.parent.mkdir(parents=True, exist_ok=True) + + with open(output_path, 'w', newline='', encoding='utf-8') as f: + if not events: + return + + writer = csv.DictWriter(f, fieldnames=events[0].to_dict().keys()) + writer.writeheader() + + for event in events: + writer.writerow(event.to_dict()) + + +class JSONExporter: + """Export events to JSON format.""" + + def export(self, events: list[SimulationEvent], output_path: Path) -> None: + """Export events to JSON file.""" + output_path.parent.mkdir(parents=True, exist_ok=True) + + with open(output_path, 'w', encoding='utf-8') as f: + json.dump([event.to_dict() for event in events], f, indent=2) +``` + +### 5. 
Usage Example (`simulation/engine.py`) + +```python +"""Simulation engine with hybrid logging.""" + +from pathlib import Path + +from core.logging.console import setup_console_logger +from core.logging.collector import EventCollector +from core.logging.exporters import CSVExporter, JSONExporter +from core.logging.events import SimulationEvent + + +class SimulationEngine: + """Simulation engine.""" + + def __init__( + self, + output_dir: Path = Path('output'), + console_enabled: bool = True, + verbose: bool = False, + csv_enabled: bool = True, + json_enabled: bool = True + ) -> None: + """Initialize simulation engine.""" + self.output_dir = output_dir + self.console_logger = setup_console_logger(verbose=verbose) if console_enabled else None + self.event_collector = EventCollector() + self.csv_enabled = csv_enabled + self.json_enabled = json_enabled + + def _emit_event(self, event: SimulationEvent) -> None: + """Emit event to both logger and collector.""" + if self.console_logger: + self.console_logger.info(f"{event.event_type}: {event.entity_id}") + + self.event_collector.collect(event) + + def run(self) -> None: + """Run simulation.""" + if self.console_logger: + self.console_logger.info("Starting simulation...") + + event = SimulationEvent( + timestamp=0.0, + event_type='train_arrival', + entity_id='TRAIN-001', + location='STATION-A', + status='arrived', + duration=120.0 + ) + + self._emit_event(event) + + if self.console_logger: + self.console_logger.info("Simulation complete") + + # Export collected data + if self.csv_enabled: + self.event_collector.export( + CSVExporter(), + self.output_dir / 'events.csv' + ) + + if self.json_enabled: + self.event_collector.export( + JSONExporter(), + self.output_dir / 'events.json' + ) +``` + +## Design Decisions + +### Core Architecture Decision +**Use TWO independent systems**: +1. **Python logging** (`logging.Logger`) for console/file output +2. 
**Custom EventCollector** for structured data export + +### System Responsibilities + +#### Logging System (Python logging) +- Console output for users +- File logs for developers +- Human-readable messages +- Controlled by verbose/debug flags + +#### Event Collection System (Custom) +- Collect SimulationEvent objects +- Export to CSV/JSON formats +- Structured data for analysis +- Independent of logging configuration + +### Data Flow +``` +Simulation Event → _emit_event(event) + ↓ + ┌─────────────┼─────────────┐ + ↓ ↓ + LOGGING SYSTEM EVENT COLLECTION + ↓ ↓ + logger.info() collector.collect() + ↓ ↓ + stdout list.append() + ↓ + exporter.export() + ↓ + CSV/JSON files +``` + +### Key Design Choices + +#### 1. Separate Systems, Not Integrated +**Decision**: Logging and data collection are completely separate. + +**Rationale**: +- Logging is for human consumption +- Data export is for machine consumption +- Different purposes = different systems +- No forced integration + +**Benefit**: Each system can evolve independently. + +#### 2. Protocol-Based Exporters +**Decision**: Use Protocol (typing.Protocol) for exporter interface. + +**Rationale**: +- Type-safe without inheritance +- Easy to add new exporters +- Duck typing with type checking +- Follows Python best practices + +**Implementation**: +```python +class DataExporter(Protocol): + def export(self, events: list[SimulationEvent], output_path: Path) -> None: + ... +``` + +#### 3. In-Memory Collection, Batch Export +**Decision**: Collect events in memory, export at end. + +**Rationale**: +- Simple implementation +- Fast collection (list.append is amortized O(1)) +- No file I/O during simulation +- Export happens after simulation completes + +**Trade-off**: Memory usage grows linearly with the number of events, and no data file is written if the run aborts before the export step (considered acceptable for about 10,000 events). + +#### 4. Event Dataclass +**Decision**: Use @dataclass for SimulationEvent. 
+ +**Rationale**: +- Automatic __init__, __repr__, __eq__ +- Type hints declared on every field (checked statically by mypy; dataclasses do not validate types at runtime) +- Easy to convert to dict with asdict() +- Can be made immutable with frozen=True (optional) + +**Benefit**: Less boilerplate, more maintainable. + +## Detailed Argumentation + +### Advantages + +#### 1. Complete Independence ✅ +**Benefit**: Logging and data export are fully decoupled. + +**Impact**: +- Console logger can be enabled/disabled independently +- Data export can be enabled/disabled independently +- Verbose flag affects only console +- Debug flag affects only logging +- No interference between systems + +**Example**: +```python +# Console off, data export on +engine = SimulationEngine( + console_enabled=False, # No console output + csv_enabled=True, # Still get CSV + json_enabled=True # Still get JSON +) +``` + +#### 2. Right Tool for Job ✅ +**Benefit**: Each system does what it's designed for. + +**Impact**: +- Logging module used for logging (its purpose) +- Custom collector used for data collection (its purpose) +- No conceptual mismatch +- Code expresses intent clearly + +**Clarity**: +```python +# Clear: this is logging +self.console_logger.info("Starting simulation...") + +# Clear: this is data collection +self.event_collector.collect(event) +``` + +#### 3. Easy to Extend ✅ +**Benefit**: Add new exporters without touching logging. + +**Impact**: +- New exporter = new class implementing Protocol +- No changes to logging system +- No changes to event collector +- Just add new exporter class + +**Example**: +```python +# Add Parquet exporter +class ParquetExporter: + def export(self, events: list[SimulationEvent], output_path: Path) -> None: + # Placeholder: the Parquet writing logic is not specified in this ADR + raise NotImplementedError + +# Use it +self.event_collector.export(ParquetExporter(), output_path) +``` + +#### 4. No Coupling ✅ +**Benefit**: Systems don't interfere with each other. 
+ +**Impact**: +- Logging changes don't affect data export +- Data export changes don't affect logging +- Can test each system independently +- Can replace either system without affecting the other + +**Independence**: +```python +# Change logging format - data export unaffected +logger = setup_console_logger(verbose=True) # Changed +self.event_collector.collect(event) # Still works exactly the same +``` + + + +#### 5. Clear Intent ✅ +**Benefit**: Code clearly shows what's logging vs data collection. + +**Impact**: +- Easy to understand for new developers +- No confusion about purpose +- Self-documenting code +- Follows principle of least surprise + +**Readability**: +```python +def _emit_event(self, event: SimulationEvent) -> None: + # Logging: human-readable message + if self.console_logger: + self.console_logger.info(f"{event.event_type}: {event.entity_id}") + + # Data collection: structured event + self.event_collector.collect(event) +``` + +### Disadvantages + +#### 1. Two Systems ⚠️ +**Trade-off**: More code to maintain (two separate systems). + +**Impact**: +- Two codebases instead of one +- Two sets of tests +- Two systems to understand +- More files in project + +**Mitigation**: +- Each system is simpler than integrated solution +- Clear separation makes maintenance easier +- Total complexity is lower + +**Reality**: This is a feature, not a bug. Separation of concerns is good design. + +#### 2. Events Processed Twice ⚠️ +**Trade-off**: Each event sent to both logger and collector. + +**Impact**: +- Two function calls per event: logger.info() + collector.collect() +- Appears to be "duplicate work" +- Both operations are simple and fast + +#### 3. Slightly More Code ⚠️ +**Trade-off**: Two implementations instead of one. 
+ +**Impact**: +- More lines of code +- More files to navigate +- Slightly larger codebase + +**Benefit**: +- Each implementation is simpler +- Easier to understand individually +- Lower cognitive load per component + +### Verbose Flag Advantages + +#### 1. Clean Separation ✅ +**Benefit**: Verbose only affects console logger, never data export. + +**Impact**: +- Data export files are always consistent +- CSV/JSON format never changes +- Dashboard can rely on stable format +- No surprises in exported data + +**Guarantee**: +```python +# Verbose or not, CSV/JSON are IDENTICAL +engine1 = SimulationEngine(verbose=False) +engine2 = SimulationEngine(verbose=True) +# Both produce identical events.csv and events.json +``` + +#### 2. Simple Configuration ✅ +**Benefit**: Single parameter passed to console logger setup. + +**Impact**: +- Easy to understand +- Easy to implement +- No complex flag passing +- Clear code flow + +**Simplicity**: +```python +self.console_logger = setup_console_logger(verbose=verbose) +# That's it. Event collector doesn't even know about verbose. +``` + +#### 3. No Side Effects ✅ +**Benefit**: Event collector completely unaffected by verbose mode. + +**Impact**: +- Data export is deterministic +- No hidden dependencies +- Easy to reason about +- Testable independently + +**Independence**: +```python +# Verbose affects this +self.console_logger.info(message) # Format changes + +# Verbose does NOT affect this +self.event_collector.collect(event) # Always the same +``` + +#### 4. Clear Intent ✅ +**Benefit**: Verbose is for human output, data export remains structured. + +**Impact**: +- Matches user expectations +- Verbose = more detail for humans +- Data export = always complete for machines +- No confusion about purpose + +## Verbose Flag Handling + +The `--verbose` flag from main.py affects **only the console logger**: + +```python +# Non-verbose console output +INFO - Starting simulation... 
+INFO - train_arrival: TRAIN-001 +INFO - Simulation complete + +# Verbose console output +2024-01-15 14:30:45 - INFO - Starting simulation... +2024-01-15 14:30:46 - INFO - train_arrival: TRAIN-001 +2024-01-15 14:35:12 - INFO - Simulation complete + +# Data export (ALWAYS the same, unaffected by verbose) +events.csv: timestamp,event_type,entity_id,location,status,duration +events.json: [{"timestamp": 0.0, "event_type": "train_arrival", ...}] +``` + +**Key Benefit**: Verbose flag has **zero impact** on data export. CSV/JSON files remain consistent regardless of console verbosity. + +## Migration Path + +1. **Setup**: Create `core/logging/` with events.py, console.py, collector.py, exporters.py +2. **Integration**: Connect to main.py (verbose/debug flags) and SimulationEngine +3. **Migration**: Replace print() with logger.info(), add _emit_event() for data collection +4. **Validation**: Verify system independence and verbose flag doesn't affect data export + +## Implementation Checklist + +- [ ] Create `core/logging/` directory structure +- [ ] Implement `events.py` with SimulationEvent dataclass +- [ ] Implement `console.py` with setup_console_logger() +- [ ] Implement `collector.py` with EventCollector class +- [ ] Implement `exporters.py` with CSVExporter and JSONExporter +- [ ] Integrate with main.py CLI (verbose, debug flags) +- [ ] Update SimulationEngine with console_logger and event_collector +- [ ] Implement _emit_event() method in SimulationEngine +- [ ] Replace print() with logger.info() and _emit_event() +- [ ] Verify with: `uv run ruff format . && uv run mypy backend/src/ && uv run pytest` + +## Conclusion + +### Summary +This approach provides **complete separation of concerns** by using: +- **Python logging** for human-readable console/file output +- **Custom event collection** for structured data export (CSV/JSON) + +These are fundamentally different concerns and deserve separate systems. 
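The validation step of the migration path (the verbose flag must not affect data export) could be checked along the following lines. This is a hedged sketch, not the project's test suite: `Event`, `simulate`, and the logger name `sketch.verbose_check` are hypothetical, the format strings are copied from `setup_console_logger` above, and the export is kept in memory as a JSON string rather than written to `events.json`.

```python
import io
import json
import logging
from dataclasses import asdict, dataclass


@dataclass
class Event:
    """Hypothetical, reduced stand-in for SimulationEvent."""

    timestamp: float
    event_type: str
    entity_id: str


def simulate(verbose: bool) -> tuple[str, str]:
    """Return (console text, exported JSON) for one hard-coded event."""
    console = io.StringIO()  # assumption: stands in for sys.stdout
    handler = logging.StreamHandler(console)
    if verbose:
        fmt = '%(asctime)s - %(levelname)s - %(message)s'
    else:
        fmt = '%(levelname)s - %(message)s'
    handler.setFormatter(logging.Formatter(fmt))

    logger = logging.getLogger('sketch.verbose_check')
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    events = [Event(0.0, 'train_arrival', 'TRAIN-001')]
    for event in events:
        # Only this call is influenced by the verbose flag.
        logger.info('%s: %s', event.event_type, event.entity_id)

    # The export path never looks at `verbose`.
    exported = json.dumps([asdict(e) for e in events], indent=2)
    return console.getvalue(), exported
```

Comparing the second element of `simulate(verbose=False)` and `simulate(verbose=True)` gives a direct regression check for the guarantee stated in the Verbose Flag Handling section.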
+ +### Strengths +- ✅ Complete independence between logging and data export +- ✅ Right tool for the job (logging for logs, custom for data) +- ✅ Easy to extend with new exporters +- ✅ No coupling between systems +- ✅ Optimal performance for each use case +- ✅ Clear code intent +- ✅ Verbose flag cleanly handled (affects only console) +- ✅ Data export always consistent +- ✅ Simple testing (test each system independently) +- ✅ Flexible configuration + +### Weaknesses +- ⚠️ Two systems to maintain (but each is simpler) +- ⚠️ Events processed twice (but both operations are fast) +- ⚠️ Slightly more code (but better organized) + +### Recommendation +This is the **preferred option** for PopUp-Sim. + +**Use this option because**: +- ✅ Clean separation of concerns +- ✅ Each system does what it's designed for +- ✅ Easy to understand and maintain +- ✅ Flexible and extensible +- ✅ Verbose flag handled correctly +- ✅ Data export is deterministic +- ✅ Performance is excellent +- ✅ Testing is straightforward +- ✅ Follows Python best practices +- ✅ Future-proof architecture + +**This option provides**: +1. **For Users**: Clear console output with optional verbose mode +2. **For Dashboard Team**: Consistent CSV/JSON data, unaffected by console settings +3. **For Developers**: Clean architecture, easy to extend, simple to test +4. **For Project**: Maintainable code, follows standards, type-safe + +### Next Steps +1. Review this ADR with team +2. Get approval from stakeholders +3. Begin Phase 1 implementation +4. Follow migration path outlined above +5. 
Monitor and iterate based on feedback diff --git a/popupsim/config/examples/small_examples/scenario.json b/popupsim/config/examples/small_examples/scenario.json new file mode 100644 index 00000000..8cf93e1d --- /dev/null +++ b/popupsim/config/examples/small_examples/scenario.json @@ -0,0 +1,29 @@ +{ + "scenario_id": "scenario_001", + "start_date": "2024-01-15", + "end_date": "2024-01-16", + "random_seed": 42, + "workshop": { + "tracks": [ + { + "id": "TRACK01", + "function": "werkstattgleis", + "capacity": 5, + "retrofit_time_min": 30 + }, + { + "id": "TRACK02", + "function": "werkstattgleis", + "capacity": 3, + "retrofit_time_min": 45 + }, + { + "id": "TRACK03", + "function": "werkstattgleis", + "capacity": 4, + "retrofit_time_min": 35 + } + ] + }, + "train_schedule_file": "train_schedule.csv" +} diff --git a/popupsim/config/examples/small_examples/workshop_tracks.csv b/popupsim/config/examples/small_examples/workshop_tracks.csv new file mode 100644 index 00000000..d199376a --- /dev/null +++ b/popupsim/config/examples/small_examples/workshop_tracks.csv @@ -0,0 +1,8 @@ +track_id,function,capacity,retrofit_time_min +TRACK01,werkstattgleis,5,30 +TRACK02,werkstattgleis,3,45 +TRACK03,sammelgleis,10,0 +TRACK04,parkgleis,8,0 +TRACK05,werkstattzufuehrung,2,0 +TRACK06,werkstattabfuehrung,2,0 +TRACK07,bahnhofskopf,3,0