This shows you the differences between two versions of the page.
| en:safeav:ctrl:testing [2026/04/23 11:27] – [Cross-Domain Insight] raivo.sell | en:safeav:ctrl:testing [2026/06/03 13:33] (current) – [Closing Note] momala | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | ====== Test Methods ====== | ||
| + | |||
| + | Scenario design tells us what should be tested. Test methods determine how those scenarios are executed, what kind of evidence is collected, and how the results are translated into a validation argument. For planning and control, this distinction matters because the same concrete scenario can be exercised in several different ways: first in simulation, then in software-in-the-loop or hardware-in-the-loop settings, then on a controlled test track, and finally in a monitored real-world environment. The book already follows this logic in its current material, where physical testing, real-world seeding, and virtual testing are treated as three complementary ways of generating and executing tests, each with different strengths and limitations. | ||
| + | |||
| + | The central idea is that the scenario from the previous subsection must be brought into a test environment that is suitable for the question being asked. A lane-change scenario, for example, may first be explored in CARLA or another simulator, then repeated with software-in-the-loop or hardware-in-the-loop components, then confirmed at a controlled proving ground such as ZalaZONE, and finally monitored in limited real-world operation. In each case, the test method changes, but the underlying scenario remains the same. That is what makes the validation evidence comparable. | ||
| + | |||
| + | {{ : | ||
| + | |||
| + | |||
| + | ===== Simulation-Based Testing ===== | ||
| + | |||
| + | Simulation is the first and most flexible execution method. It allows the team to test a large number of scenario variants quickly, safely, and repeatably. This is especially important for planning and control, because the behavior of these modules depends strongly on speed, spacing, road geometry, actor behavior, and timing. A simulator can sweep those parameters systematically and expose boundary cases that would be too risky or too expensive to reproduce physically. | ||
| + | |||
| + | The current book already contains a strong simulation toolbox, and that material should be used directly here. For ground systems, CARLA is a natural open-source choice for academic and research work because it supports realistic urban scenes and sensor stacks. NVIDIA DRIVE Sim is useful when the goal is GPU-accelerated synthetic data and digital-twin style validation. IPG CarMaker, dSPACE ASM, VIRES VTD, Applied Intuition, Cognata, and MathWorks tools can be used when the focus shifts toward closed-loop vehicle dynamics, scenario coverage, or industrial validation workflows. These platforms are not identical, and that is part of the point: some are better for scenario breadth, some for sensor realism, some for controller validation, and some for integration with SIL and HIL. | ||
| + | |||
| + | For planning and control, simulation is especially useful when the test objective is one of the following: | ||
| + | - checking whether the planner generates a safe trajectory across many parameter combinations; | ||
| + | - checking whether the controller remains stable under delay, friction change, or uncertain motion; | ||
| + | - identifying which scenario parameters drive unsafe or uncomfortable behavior; | ||
| + | - replaying rare or dangerous edge cases without exposing people or equipment to risk. | ||
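| + | |||
| + | The first two objectives can be illustrated with a minimal parameter sweep. The sketch below is not taken from any simulator; it assumes a deliberately simple low-fidelity model (a stationary obstacle, a fixed reaction delay, and friction-limited braking), and every function name and parameter value is hypothetical. | ||

```python
from itertools import product

G = 9.81  # gravitational acceleration, m/s^2


def stopping_margin(speed, gap, friction, delay):
    """Remaining distance (m) to a stationary obstacle after a full stop.

    Assumed model: constant speed during the delay, then braking at friction * G.
    Negative values mean the ego vehicle could not stop in time.
    """
    stopping_distance = speed * delay + speed ** 2 / (2.0 * friction * G)
    return gap - stopping_distance


def sweep(speeds, gaps, frictions, delays):
    """Evaluate every parameter combination and return one record per case."""
    results = []
    for speed, gap, friction, delay in product(speeds, gaps, frictions, delays):
        results.append({
            "speed": speed, "gap": gap, "friction": friction, "delay": delay,
            "margin": stopping_margin(speed, gap, friction, delay),
        })
    return results


results = sweep(
    speeds=[10.0, 20.0, 30.0],  # m/s
    gaps=[40.0, 80.0],          # m
    frictions=[0.4, 0.8],       # illustrative wet and dry values
    delays=[0.2, 0.6],          # s, assumed planner plus actuation latency
)
unsafe = [r for r in results if r["margin"] < 0.0]
print(f"{len(unsafe)} of {len(results)} cases violate the stopping margin")
```

| + | In a real workflow the closed-form margin would be replaced by a simulator run, but the structure stays the same: enumerate the parameter grid, record one result per case, and filter for the cases that deserve closer study. | ||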
| + | |||
| + | {{ : | ||
| + | |||
| + | A practical way to use simulation is to split it into two layers. Low-fidelity simulation is used first to sweep large scenario spaces quickly and identify where safety margins begin to tighten. High-fidelity simulation is then used for the most important cases, where sensor realism, closed-loop dynamics, and timing behavior matter more. The book’s current material already describes this logic: low-fidelity simulation is useful for broad exploration, | ||
| + | |||
| + | ===== Software-in-the-Loop and Hardware-in-the-Loop ===== | ||
| + | |||
| + | Simulation becomes more valuable when the actual autonomy stack is connected to it. In software-in-the-loop testing, the real planning or control software runs inside the virtual environment. This is useful because it tests the actual code while keeping the physical risk low. If the software produces the wrong maneuver, the wrong timing, or the wrong fallback action, the error can be observed in a safe and repeatable setting. | ||
| + | |||
| + | Hardware-in-the-loop adds another layer of realism. It places real or representative hardware into the loop, such as ECUs, data buses, actuator interfaces, or timing elements. This is particularly important for planning and control, because the question is often not only whether the algorithm is correct, but whether the command reaches the vehicle correctly and on time. A planner that works in software may still fail once the actuation path, timing jitter, or bus communication is introduced. | ||
| + | |||
| + | The current manuscript already gives a good example of this in the discussion of virtual ECUs and data buses, where the test rig can simulate bus traffic, counters, checksums, subsystem failures, and graceful degradation. That material fits naturally here because it shows how HIL-style twins help validate actuator-path integrity without requiring a full physical rig. | ||
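| + | |||
| + | The counter and checksum checks mentioned above can be sketched in a few lines. The frame layout, the 4-bit rolling counter, and the XOR checksum below are assumptions chosen for illustration only; real automotive buses define their own schemes, and none of these names come from the manuscript. | ||

```python
def xor_checksum(payload: bytes) -> int:
    """Illustrative 8-bit XOR checksum (an assumption, not a real bus standard)."""
    value = 0
    for byte in payload:
        value ^= byte
    return value


def build_frame(counter: int, command: bytes) -> bytes:
    """Hypothetical frame layout: [counter mod 16][command bytes][checksum]."""
    body = bytes([counter % 16]) + command
    return body + bytes([xor_checksum(body)])


def check_frame(frame: bytes, expected_counter: int) -> list:
    """Return the detected faults; an empty list means the frame is accepted."""
    faults = []
    body, checksum = frame[:-1], frame[-1]
    if xor_checksum(body) != checksum:
        faults.append("checksum mismatch")
    if body[0] != expected_counter % 16:
        faults.append("counter mismatch")
    return faults


good = build_frame(counter=3, command=b"\x10\x20")
# Fault injection 1: flip bits in the first command byte, leave the checksum as sent.
corrupted = bytes([good[0], good[1] ^ 0xFF]) + good[2:]
# Fault injection 2: replay a frame that carries the previous counter value.
stale = build_frame(counter=2, command=b"\x10\x20")

for name, frame in [("good", good), ("corrupted", corrupted), ("stale", stale)]:
    print(name, check_frame(frame, expected_counter=3))
```

| + | A test rig built along these lines injects each fault deliberately and then checks that the receiving side rejects the frame and degrades gracefully instead of acting on a corrupted or outdated command. | ||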
| + | |||
| + | ===== Test Tracks and Physical Testing ===== | ||
| + | |||
| + | Test tracks are the bridge between simulation and real-world operation. They provide physical realism while preserving a controlled environment in which scenarios can be repeated, instrumented, | ||
| + | |||
| + | A representative test track for ground systems is ZalaZONE in Hungary. ZalaZONE includes a Smart City Zone, highway and rural sections, a high-speed oval, a dynamic platform, wet and dry handling courses, off-road areas, and V2X/5G infrastructure. It also supports simulation and digital-twin integration through platforms such as IPG CarMaker and AVL tools, making it especially useful for SIL and HIL validation alongside physical track tests. | ||
| + | |||
| + | <!-- Figure comment: ZalaZONE autonomous test track --> | ||
| + | {{ : | ||
| + | |||
| + | Test-track validation is particularly suitable for: | ||
| + | - lane changes and overtaking; | ||
| + | - cut-in and cut-out events; | ||
| + | - obstacle avoidance; | ||
| + | - stop and yield behavior; | ||
| + | - emergency braking; | ||
| + | - localization disturbance checks; | ||
| + | - controller timing and actuation tests. | ||
| + | |||
| + | The strength of a test track is controllability. The same maneuver can be repeated under carefully defined conditions, and the result can be compared against the corresponding simulation case. This makes it possible to isolate whether an unsafe outcome came from the scenario itself, the planner, the controller, the localization path, or the actuation behavior. | ||
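| + | |||
| + | One simple way to compare a track run against its simulation counterpart is a root-mean-square deviation between the two recorded paths. The sketch below assumes both paths are already time-aligned lists of (x, y) positions in metres; the sample values and the tolerance are invented for illustration. | ||

```python
import math


def rms_deviation(sim_path, track_path):
    """Root-mean-square distance (m) between time-aligned (x, y) samples."""
    if len(sim_path) != len(track_path) or not sim_path:
        raise ValueError("paths must be non-empty and time-aligned")
    squared = [
        (xs - xt) ** 2 + (ys - yt) ** 2
        for (xs, ys), (xt, yt) in zip(sim_path, track_path)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical lane-change paths: simulated versus measured on the track.
sim_path = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.5), (30.0, 3.0)]
track_path = [(0.0, 0.0), (10.0, 0.6), (20.0, 1.8), (30.0, 3.4)]

TOLERANCE_M = 0.5  # assumed acceptance threshold, not a value from the source
deviation = rms_deviation(sim_path, track_path)
print(f"RMS deviation: {deviation:.3f} m, within tolerance: {deviation <= TOLERANCE_M}")
```

| + | A deviation above the tolerance does not say which side is wrong. It only flags that the simulation model and the physical run disagree, which is the trigger for checking the planner, controller, localization, and actuation paths in turn. | ||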
| + | |||
| + | The chapter should also keep the existing infrastructure discussion on sensor and EMC testing, because that supports the broader idea of physical validation. Anechoic chambers, fully anechoic chambers, semi-anechoic chambers, RF-shielded rooms, and reverberation chambers are important when sensor behavior, electromagnetic interference, | ||
| + | |||
| + | <!-- Figure comment: anechoic chamber --> | ||
| + | |||
| + | ===== Real-World Testing and Real-World Seeding ===== | ||
| + | |||
| + | Real-world testing is the most demanding method because it captures the actual operational environment. It should therefore be used after the system has already shown acceptable behavior in simulation and on the track. The goal is not to replace simulation or track testing, but to confirm that the validated behavior survives contact with the real world. | ||
| + | |||
| + | The current material gives a useful distinction that should be preserved: one line of validation uses real-world experience as the starting point for further virtual testing, while another line uses the fleet or field itself as a large distributed testbed. Pegasus and the Warwick-related scenario database are good examples of the first case, where observed situations are turned into reusable validation material. The Tesla-style fleet approach is a good example of the second, where data from the field is fed into a large-scale validation pipeline. OpenSCENARIO 2.0 also belongs here because it supports symbolic, reproducible scenario generation based on structured descriptions rather than ad hoc test notes. | ||
| + | |||
| + | This section is also the right place to mention that test generation can be seeded by observed events. Real-world seeding is valuable because it gives the team real situations instead of purely synthetic ones. However, completeness is still an open issue, and there is always a risk that the collected database overrepresents familiar or already-seen conditions. That is why seeding should be treated as a source of test diversity, not as a complete validation solution. | ||
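| + | |||
| + | Seeding can be sketched as perturbing the parameters of one observed event to obtain a family of nearby test cases. The example below is a minimal illustration under stated assumptions: the parameter names, the observed values, and the uniform perturbation model are all hypothetical and do not reproduce any specific project's method. | ||

```python
import random


def seed_variants(observed, spreads, count, rng_seed=0):
    """Generate scenario variants by perturbing an observed event.

    observed: dict of parameter name -> measured value
    spreads:  dict of parameter name -> maximum absolute perturbation
              (parameters without an entry are kept unchanged)
    """
    rng = random.Random(rng_seed)  # fixed seed keeps the generated set reproducible
    variants = []
    for _ in range(count):
        variant = {}
        for name, value in observed.items():
            spread = spreads.get(name, 0.0)
            variant[name] = value + rng.uniform(-spread, spread)
        variants.append(variant)
    return variants


# Hypothetical cut-in event extracted from field data (speeds in m/s, gap in m).
observed_cut_in = {"ego_speed": 25.0, "cut_in_gap": 12.0, "lead_speed": 21.0}
spreads = {"ego_speed": 3.0, "cut_in_gap": 4.0}

variants = seed_variants(observed_cut_in, spreads, count=5)
for variant in variants:
    print({name: round(value, 2) for name, value in variant.items()})
```

| + | Because every variant stays close to the observed event, this approach inherits the bias discussed above: it diversifies around what was seen and says nothing about situations that never appeared in the data. | ||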
| + | |||
| + | Real-world testing is most useful when the question is: | ||
| + | - does the system remain safe in its intended ODD? | ||
| + | - do the planner and controller remain stable under actual traffic and infrastructure conditions? | ||
| + | - do the simulation and track assumptions still hold once the system is deployed? | ||
| + | |||
| + | ===== Choosing the Right Method ===== | ||
| + | |||
| + | The test method should follow the validation question. | ||
| + | |||
| + | ^ Validation question ^ Best method to start with ^ | ||
| + | | Can the planner produce a safe trajectory across many parameter combinations? | ||
| + | | Does the real software behave correctly in the virtual world? | Software-in-the-loop | | ||
| + | | Does timing, communication, | ||
| + | | Does the system behave correctly under controlled physical conditions? | Test track | | ||
| + | | Does the system remain safe in the intended operating environment? | ||
| + | |||
| + | This is not a rigid ladder. In practice, validation moves back and forth between methods. A track failure may lead to changes in the scenario model or the simulator. A simulation failure may lead to a revised controller or a narrower ODD. A real-world failure may lead to a new safety margin, a changed fallback rule, or a better test-track reproduction of the same case. | ||
| + | |||
| + | The important point is that each method contributes a different kind of evidence. Simulation gives scale, SIL and HIL give integration realism, test tracks give controlled physical confirmation, | ||
| + | |||
| + | ===== Evidence Produced by Testing ===== | ||
| + | |||
| + | The results of planning and control testing should be recorded in a form that can be compared across methods and reused in the safety argument. The most useful evidence is: | ||
| + | - trajectory error; | ||
| + | - tracking error; | ||
| + | - Time-to-Collision and Distance-to-Collision; | ||
| + | - collision or near-miss events; | ||
| + | - maneuver completion time; | ||
| + | - planner latency; | ||
| + | - controller response delay; | ||
| + | - jerk and acceleration comfort measures; | ||
| + | - safe fallback activation; | ||
| + | - repeatability across similar runs. | ||
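| + | |||
| + | Two of these quantities are easy to compute from logged data. The sketch below assumes a simple car-following situation for Time-to-Collision (gap divided by closing speed) and equally spaced acceleration samples for jerk; the function names and sample values are illustrative only. | ||

```python
def time_to_collision(gap, ego_speed, lead_speed):
    """TTC in seconds for a following situation; None when the gap is not closing."""
    closing_speed = ego_speed - lead_speed
    if closing_speed <= 0.0:
        return None
    return gap / closing_speed


def jerk_series(accelerations, dt):
    """Finite-difference jerk (m/s^3) from equally spaced acceleration samples."""
    return [(a1 - a0) / dt for a0, a1 in zip(accelerations, accelerations[1:])]


# Hypothetical log excerpt: 30 m gap, ego at 20 m/s, lead vehicle at 15 m/s.
print(time_to_collision(gap=30.0, ego_speed=20.0, lead_speed=15.0))

# Hypothetical longitudinal acceleration samples taken every 0.5 s.
jerks = jerk_series([0.0, 0.5, 1.5], dt=0.5)
print(max(abs(j) for j in jerks))
```

| + | Reporting the minimum TTC and the peak absolute jerk per run gives one safety number and one comfort number that can be compared directly between simulation, track, and field data. | ||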
| + | |||
| + | These outputs should be interpreted together. A maneuver that is accurate but unsafe is not acceptable. A maneuver that is safe but erratic may also be unacceptable if it creates instability or poor comfort. The validation report should therefore link each result back to the scenario definition, the test method, and the original system requirement. | ||
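| + | |||
| + | The linkage between result, scenario, method, and requirement can be made explicit in the data structure used for reporting. The record layout below is a hypothetical sketch, not a format defined by the book; all identifiers and limits are invented for illustration. | ||

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    scenario_id: str      # hypothetical identifier of the concrete scenario
    method: str           # e.g. "simulation", "SIL", "HIL", "track", "real-world"
    requirement_id: str   # hypothetical identifier of the system requirement
    metric: str
    value: float
    limit: float
    lower_is_better: bool = True

    def passed(self) -> bool:
        """Compare the measured value against the limit in the right direction."""
        if self.lower_is_better:
            return self.value <= self.limit
        return self.value >= self.limit


def failures_by_scenario(records):
    """Group failing records by scenario so methods can be compared side by side."""
    grouped = defaultdict(list)
    for record in records:
        if not record.passed():
            grouped[record.scenario_id].append(record)
    return dict(grouped)


records = [
    EvidenceRecord("lane_change_01", "simulation", "REQ-TTC", "min_ttc_s", 2.4, 2.0,
                   lower_is_better=False),
    EvidenceRecord("lane_change_01", "track", "REQ-TTC", "min_ttc_s", 1.7, 2.0,
                   lower_is_better=False),
    EvidenceRecord("lane_change_01", "track", "REQ-LAT", "planner_latency_ms", 80.0, 100.0),
]

for scenario_id, failed in failures_by_scenario(records).items():
    for record in failed:
        print(scenario_id, record.method, record.requirement_id, record.metric, record.value)
```

| + | In this invented example the same scenario passes the TTC requirement in simulation but fails it on the track, which is exactly the kind of cross-method discrepancy the report should surface. | ||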
| + | |||
| + | |||
| + | The role of this subsection is to turn scenarios into evidence. Simulation, track testing, and real-world testing are not competing methods; they are complementary layers of the same validation strategy. Simulation gives breadth, physical test tracks give controlled confirmation, | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | /* ---- commented ----- | ||
| ====== Physical Testing ====== | ====== Physical Testing ====== | ||
| Line 39: | Line 148: | ||
| Across all four domains, physical testing evolves from **highly repeatable, scenario-rich environments (ground)** to **physics-constrained, | Across all four domains, physical testing evolves from **highly repeatable, scenario-rich environments (ground)** to **physics-constrained, | ||
| + | ===== Testing Infrastructure ===== | ||
| + | < | ||
| + | |||
| + | As discussed earlier, the generic V&V process consists of testing the product within its ODD, generally using a number of techniques. The central paradigm is to generate a test, execute the test, and apply clear criteria for correctness. | ||
| + | |||
| + | - Physical Testing: Typically, physical testing is the most expensive way to verify functionality at scale. However, Tesla has built a flow where their existing fleet is a large distributed testbed. | ||
| + | - Real-World Seeding: Another line of test generation is to use physical situations as a seed for further virtual testing. Pegasus, the seminal project initiated in Germany, took such an approach. The project emphasized a scenario-based testing methodology which used observed data from real-world conditions as a base. Another similar effort comes from Warwick University with a focus on test environments, | ||
| + | - Virtual Testing: Another important contribution was ASAM OpenSCENARIO 2.0 which is a domain-specific language designed to enhance the development, | ||
| + | |||
| + | Beyond component validation, there have been proposed solutions specifically for autonomous systems such as UL 4600, " | ||
| + | |||
| + | What kind of testing infrastructure is required to execute these methodologies? | ||
| + | |||
| + | The baseline for automotive physical testing is a set of facilities for crash testing, road variations, and weather effects. | ||
| + | Several levels of test infrastructure have emerged around the topics of sensors, test tracks, and virtual simulation. | ||
| + | |||
| + | {{: | ||
| + | Figure: | ||
| + | |||
| + | For sensors, important equipment includes: | ||
| + | - Anechoic Chambers: These chambers are characterized by their anechoic (echo-free) interior, meaning they are designed to completely absorb sound or electromagnetic waves to eliminate reflections from the walls, ceiling, and sometimes the floor. | ||
| + | - Fully Anechoic Chambers (FAC): These chambers have all interior surfaces (walls, ceiling, and floor) covered with RF absorbing materials, creating an environment free from reflections. They are ideal for high-precision measurements like antenna testing or situations where a free-space environment is needed. | ||
| + | - Semi-Anechoic Chambers (SAC): In this type, the walls and ceiling are covered with absorbing materials, while the floor remains reflective (often a metal ground plane). This reflective floor helps simulate real-world environments, | ||
| + | - RF Shielded Rooms (Faraday Cages): These are enclosed rooms designed to block the entry or exit of electromagnetic radiation. They are constructed with a conductive shield (typically copper or other metals) around the walls, ceiling, and floor, minimizing the entry or exit of electromagnetic interference (EMI). They are a fundamental component of many EMI testing facilities. | ||
| + | - Reverberation Chambers: These chambers intentionally use resonances and reflections within the chamber to create a statistically uniform electromagnetic field. They can accommodate larger and more complex test setups and are particularly useful for immunity testing where the device is exposed to interference from all directions. However, their performance can be limited at lower frequencies. | ||
| + | |||
| + | {{: | ||
| + | Figure: ZalaZONE Autonomous Test Track | ||
| + | |||
| + | Traditional test tracks, which were originally used for mechanical testing, have been extended for testing autonomy functions. A recent example, shown in the figure above, is ZalaZONE, a large test track located in Hungary. | ||
| + | ZalaZONE also provides integration with simulation and digital twin environments. Through platforms such as IPG CarMaker and AVL tools, developers can carry out software-in-the-loop (SIL) and hardware-in-the-loop (HIL) testing in parallel with on-track validation. | ||
| + | |||
| + | {{: | ||
| + | Figure: CARLA Simulator | ||
| + | |||
| + | Finally, a great deal of simulation is done virtually. Simulation plays a critical role in the development and validation of autonomous vehicles (AVs), allowing developers to test perception, planning, and control systems in a wide range of scenarios without physical risk. Among the most prominent tools is CARLA, an open-source simulator built for academic and research use, known for its realistic urban environments, | ||
| + | */ | ||