{{:en:iot-open:czapka_m.png?50| Masters (2nd level) classification icon }}
  
<todo @bertlluk>CTU</todo>

Having designed the sensor, object-recognition, and location-services components, how does one test them? The fundamentals are consistent with the discussion in chapter 2: one defines an ODD, builds tests under that ODD, applies the tests, and determines correctness. The tests can be applied virtually (simulation), physically (test track), or as a mix per component (hardware-in-the-loop or software-in-the-loop). The population of tests must be complete enough to demonstrate sufficient coverage. The introduction of sensors and AI adds significant complexity to this process.

Testing sensors in safety-critical systems is particularly challenging when viewed through the lens of verification, validation (V&V), and certification, because sensors are both hardware devices and context-dependent measurement systems. Verification (ensuring the sensor meets its design specifications) can be addressed through laboratory calibration, environmental stress testing, and compliance with standards such as ISO 16750 (environmental conditions), DO-160 (avionics), and MIL-STD-810 (defense systems). However, validation (ensuring the sensor performs adequately in real operational contexts) is far more complex. Sensor performance depends heavily on the operational design domain (ODD), including weather, lighting, clutter, and interference conditions, which are difficult to fully replicate or bound. This gap between controlled verification and real-world validation is especially acute for perception sensors (e.g., cameras, radar, lidar), whose performance is probabilistic rather than deterministic and strongly influenced by environmental variability. Today, there is a great deal of innovation in mechanical test apparatus that mimics physical movement inside anechoic chambers to recreate difficult test scenarios. In the outdoor environment, hives of drones acting as EM sensors and noise sources provide a similar function for test tracks.

^ Conventional Algorithm ^ ML Algorithm ^ Comment ^
| Logical Theory | No Theory | In conventional algorithms, one needs a theory of operation to implement the solution. ML algorithms can often work without a clear understanding of exactly why they work. |
| Analytical | Not Analytical | Conventional algorithms are accurate in a way we can understand; however, ML algorithms are not easily understood and often behave like a "black box." |
| Causal | Correlation | Conventional algorithms focus on causality, while ML algorithms discover correlations. The difference is important if one wants to reason at higher levels. |
| Deterministic | Non-Deterministic | Conventional algorithms are deterministic in nature, while ML algorithms are fundamentally probabilistic. |
| Known Computational Complexity | Unknown Computational Complexity | Given the analyzable nature of conventional algorithms, one can build a model of computational complexity. This is not always possible for ML techniques, which may require testing to evaluate computational complexity. |

//**Table 1: Contrast of Conventional and Machine Learning Algorithms**//

The introduction of AI as a replacement for traditional software introduces significant validation issues (Table 1). Significantly, many of the techniques developed for testing software, such as code reviews, code coverage, and static analysis tools, do not carry over directly. Further, to test an AI component, it appears likely that one must test the method by which it was trained and have access to the training data.

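The deterministic/probabilistic contrast in Table 1 can be made concrete with a toy sketch. All names here are illustrative, not from any real stack: a conventional braking-distance formula returns the same value for the same input, while a stand-in for a learned detector returns a confidence that varies from call to call.

```python
import random

def braking_distance_m(speed_mps, decel_mps2=6.0):
    # Conventional algorithm: closed-form, analyzable, deterministic.
    return speed_mps ** 2 / (2.0 * decel_mps2)

class ToyLearnedDetector:
    # Illustrative stand-in for an ML component: its output is probabilistic,
    # so validation must reason about distributions, not single values.
    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def confidence(self, feature_sum):
        base = min(0.95, 0.5 + 0.05 * feature_sum)
        return min(1.0, max(0.0, self.rng.gauss(base, 0.05)))

# The conventional algorithm is repeatable; the learned stand-in is not,
# unless its random state is pinned (itself a common validation tactic).
assert braking_distance_m(20.0) == braking_distance_m(20.0)
```

Testing the formula amounts to checking a few points against the closed form; testing the detector requires sampling many runs and bounding the resulting distribution.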
Safety standards across automotive, marine, airborne, and space domains are now evolving to address the introduction of AI/ML-driven functionality, shifting from purely deterministic assurance models toward data-driven and probabilistic validation frameworks. In automotive, traditional functional safety under ISO 26262 has been extended by ISO/PAS 8800 and ISO 21448 to explicitly address perception uncertainty, training data coverage, and performance limitations of AI-based systems. In aviation, guidance such as DO-178C is being supplemented by emerging frameworks like DO-387 (in development) to tackle non-deterministic behavior, explainability, and learning assurance. Similarly, space systems governed by ECSS standards and marine systems guided by International Maritime Organization frameworks are beginning to incorporate autonomy and AI considerations, particularly for unmanned and remotely operated platforms. Across all domains, a common trend is emerging: safety assurance is moving from static compliance toward lifecycle-based assurance, including dataset governance, simulation-based validation, runtime monitoring, and continuous certification concepts. This reflects a fundamental shift in safety engineering, from proving correctness of fixed logic to bounding the behavior of adaptive, data-driven systems operating under uncertainty.

The remainder of this section presents a practical, simulation-driven illustration of validating the perception, mapping (HD maps/digital twins), and localization layers of an autonomous driving stack. The core idea is to anchor tests in the operational design domain (ODD), express them as reproducible scenarios, and report metrics that connect module-level behavior to system-level safety.

====== Scope, ODD, and Assurance Frame ======

We decompose the stack into Perception (object detection/tracking), Mapping (HD map/digital twin creation and consistency), and Localization (GNSS/IMU and vision/LiDAR aiding) and validate each with targeted KPIs and fault injections. The evidence is organized into a safety case that explains how module results compose at system level. Tests are derived from the ODD and instantiated as logical/concrete scenarios (e.g., with a scenario language like Scenic) over the target environment. This gives you systematic coverage and reproducible edge-case generation while keeping hooks for standards-aligned arguments (e.g., ISO 26262/SOTIF) and formal analyses where appropriate.
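Instantiating logical scenarios as concrete ones can be sketched in plain Python as a stand-in for a scenario language such as Scenic; the parameter names and ranges below are hypothetical, not taken from any real ODD specification.

```python
import random

def sample_concrete_scenarios(logical, n, seed=42):
    """Draw n concrete scenarios from a logical scenario whose parameters
    are (low, high) ranges. Fixing the seed keeps runs reproducible."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in logical.items()}
        for _ in range(n)
    ]

# Hypothetical cut-in logical scenario derived from the ODD.
cut_in = {
    "ego_speed_mps": (8.0, 20.0),
    "cut_in_gap_m": (5.0, 30.0),
    "occlusion_level": (0.0, 1.0),
}
concrete = sample_concrete_scenarios(cut_in, n=100)
```

Pinning the seed is what makes the generated edge cases reproducible across teams, which is the property the safety case relies on.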
  
====== Perception Validation Illustration ======

  
Figure 1 illustrates object comparison. Green boxes mark objects captured in the ground truth, while red boxes mark objects detected by the AV stack. Threshold-based rules compare the two sets of objects. The comparison is expected to provide specific indicators of detectable vehicles at different ranges within the safety and danger areas.
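The threshold-based box comparison can be sketched with intersection-over-union (IoU) matching. This is a minimal sketch: the greedy one-to-one matcher and the 0.5 threshold are illustrative assumptions, not the stack's actual rules.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(ground_truth, detections, iou_threshold=0.5):
    """Greedily match each ground-truth (green) box to its best detection
    (red) box; returns (true_positives, missed, spurious)."""
    unmatched = list(detections)
    tp, missed = 0, 0
    for gt in ground_truth:
        best = max(unmatched, key=lambda d: iou(gt, d), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            tp += 1
            unmatched.remove(best)
        else:
            missed += 1
    return tp, missed, len(unmatched)
```

Binning the resulting counts by range to the ego vehicle then yields the per-range detectability indicators for the safety and danger areas.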
====== Mapping / Digital-Twin Validation Illustration ======

Key checks include lane topology fidelity versus survey, geo-consistency in centimeters, and semantic consistency (e.g., correct placement of occluders, signs, crosswalks). The scenarios used for perception and localization are bound to this twin so that results can be reproduced and shared across teams or vehicles. Over time, you add change-management: detect and quantify drifts when the real world changes (construction, foliage, signage) and re-validate affected scenarios.
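The centimeter-level geo-consistency check can be sketched as a per-station comparison between surveyed and map centerlines. The sketch assumes both polylines are resampled at corresponding stations; the function names and the 10 cm tolerance are illustrative.

```python
import math

def centerline_deviation_cm(survey_pts, map_pts):
    """Per-station deviation (cm) between surveyed and map centerline
    samples, given as (x, y) pairs in meters."""
    return [
        100.0 * math.hypot(sx - mx, sy - my)
        for (sx, sy), (mx, my) in zip(survey_pts, map_pts)
    ]

def geo_consistency_report(survey_pts, map_pts, tolerance_cm=10.0):
    """Summarize deviations and count stations exceeding the tolerance."""
    devs = centerline_deviation_cm(survey_pts, map_pts)
    return {
        "max_cm": max(devs),
        "mean_cm": sum(devs) / len(devs),
        "violations": sum(d > tolerance_cm for d in devs),
    }
```

A nonzero violation count marks a drifted map segment, triggering the change-management step of re-validating the scenarios bound to it.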
  
====== Localization Validation Illustration ======

  
  
<figure Localization Validation>
{{ :en:safeav:maps:localization_val.png?400 | localization validation}}
<caption>Localization validation: in some cases, the difference between the expected location and the actual location may lead to accidents.</caption>
</figure>

The current validation methods perform a one-to-one mapping between the expected and actual locations. As shown in Fig. 2, the vehicle's position deviation is computed for each frame and recorded in the validation report. Parameters such as the min/max/mean deviations are then calculated from the same report. In the validation procedure, it is also possible to modify the simulator to embed a mechanism that adds noise to the localization process, in order to check robustness and validate performance.
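The per-frame deviation computation and the noise-injection hook can be sketched as follows; function names and the noise magnitude are illustrative assumptions, not the simulator's actual API.

```python
import math
import random

def localization_deviations(expected, actual):
    """Per-frame Euclidean deviation (m) between expected and reported
    (x, y) positions, one pair per frame."""
    return [
        math.hypot(ex - ax, ey - ay)
        for (ex, ey), (ax, ay) in zip(expected, actual)
    ]

def deviation_report(devs):
    """Summary statistics reported after the run, as in the validation report."""
    return {"min": min(devs), "max": max(devs), "mean": sum(devs) / len(devs)}

def inject_gaussian_noise(positions, sigma_m=0.2, seed=0):
    """Perturb reported positions to probe robustness of downstream checks."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, sigma_m), y + rng.gauss(0.0, sigma_m))
            for x, y in positions]
```

Re-running the deviation report on noise-injected positions shows how gracefully the min/max/mean figures degrade as localization quality worsens.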
====== Multi-Fidelity Workflow and Scenario-to-Track Bridge ======
  
  
A two-stage workflow balances coverage and realism. First, use LF tools (e.g., planner-in-the-loop with simplified sensors and traffic) to sweep large grids of logical scenarios and identify risky regions in parameter space (relative speed, initial gap, occlusion level). Then, promote the most informative concrete scenarios to HF simulation with photorealistic sensors for end-to-end validation of perception and localization interactions. Where appropriate, a small, curated set of scenarios is carried to closed-track trials. Success criteria are consistent across all stages, and post-run analyses attribute failures to perception, localization, prediction, or planning so fixes are targeted rather than generic.
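The low-fidelity sweep can be sketched as a grid over scenario parameters with a simple surrogate risk criterion. Here a toy time-to-collision stands in for the planner-in-the-loop result; the threshold and parameter grids are illustrative assumptions.

```python
def sweep_logical_scenarios(rel_speeds, gaps, ttc_threshold_s=2.0):
    """Sweep a (relative speed, initial gap) grid and flag risky cells
    where a toy time-to-collision falls below the threshold. The risky
    cells are the candidates promoted to high-fidelity simulation."""
    risky = []
    for v in rel_speeds:          # closing speed, m/s
        for gap in gaps:          # initial gap, m
            ttc = gap / v if v > 0 else float("inf")
            if ttc < ttc_threshold_s:
                risky.append({"rel_speed": v, "gap": gap, "ttc": ttc})
    return risky
```

In practice the surrogate is a full LF run per cell, but the promotion logic (keep only cells below the risk threshold) is the same.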

====== Summary ======

The chapter develops a comprehensive view of perception, mapping, and localization as the foundation of autonomous systems, emphasizing how modern autonomy builds on both historical automation (e.g., autopilots across domains) and recent advances in AI. It explains how perception converts raw sensor data—across cameras, LiDAR, radar, and acoustic systems—into structured understanding through object detection, sensor fusion, and scene interpretation. A key theme is that no single sensor is sufficient; instead, robust autonomy depends on multi-modal sensor fusion, probabilistic estimation, and careful calibration to manage uncertainty. The chapter also highlights the transformative role of AI, particularly deep learning, in enabling scalable perception and scene understanding, while noting that these methods introduce new challenges related to data dependence, generalization, and interpretability.

A second major focus is on sources of instability and validation, where the chapter connects environmental effects (weather, electromagnetic interference), infrastructure constraints, and semiconductor economics to system-level performance. It underscores that validation must be grounded in the operational design domain (ODD) and cannot rely solely on physical testing, requiring a combination of simulation, hardware-in-the-loop, and scenario-based methods. The introduction of AI further complicates verification and validation because of its probabilistic, non-deterministic nature, challenging traditional assurance techniques. As a result, safety approaches across domains are evolving toward lifecycle-based assurance, incorporating data governance, simulation-driven testing, and continuous monitoring. The chapter concludes with a structured validation framework that links perception, mapping, and localization performance to system-level safety metrics, emphasizing reproducibility, coverage, and traceability in building a credible safety case.

====== Assessment ======

^ # ^ Project Title ^ Description ^ Learning Objectives ^
| 1 | Multi-Sensor Perception Benchmarking | Build a perception pipeline using at least two sensor modalities (e.g., camera + LiDAR or radar). Evaluate object detection performance under varying conditions (lighting, weather, occlusion) using real or simulated datasets. | Understand strengths/limitations of different sensors. Apply sensor fusion concepts. Evaluate detection metrics (precision/recall, distance sensitivity). Analyze environmental impacts on perception. |
| 2 | ODD-Driven Scenario Generation & Validation Study | Define an Operational Design Domain (ODD) for an autonomous system (e.g., urban driving, coastal navigation). Generate a set of test scenarios (including edge cases) and validate system performance using simulation tools. | Define and scope an ODD. Develop scenario-based testing strategies. Understand coverage and edge-case generation. Link scenarios to safety outcomes. |
| 3 | Sensor Failure and Degradation Analysis | Simulate sensor failures (e.g., camera blackout, GNSS loss, radar noise) and analyze system-level impact on perception, localization, and safety metrics (e.g., time-to-collision). | Understand failure modes across sensor types. Evaluate system robustness and redundancy. Apply fault injection techniques. Connect sensor degradation to safety risks. |
| 4 | AI vs Conventional Algorithm Validation Study | Compare a traditional perception algorithm (e.g., rule-based or classical ML) with a deep learning model on the same dataset. Analyze differences in performance, interpretability, and validation challenges. | Distinguish deterministic vs probabilistic systems. Understand validation challenges of AI/ML. Evaluate explainability and traceability. Assess implications for safety certification. |
| 5 | End-to-End V&V Framework Design (Digital Twin) | Design a validation framework for perception, mapping, and localization using simulation (digital twin). Include KPIs, test conditions (e.g., ISO 26262, SOTIF), simulations, and linkage to safety standards. | Design system-level V&V strategies. Define measurable KPIs for autonomy. Understand simulation and digital twin roles. Connect numerical validation to safety standards. |