Validation Requirements across Domains

[pczekalski]Introduce cybersecurity chapter - add V&V relation for cybersecurity

In terms of domains, the Operational Design Domain (ODD) is the driving factor, and typically has two dimensions. The first is the operational model, and the second is the physical domain (ground, airborne, marine, space). On the ground, Passenger AVs are perhaps the most well-known face of autonomy, with robo-taxi services and self-driving consumer vehicles gradually entering urban environments. Companies like Waymo, Cruise, and Tesla have taken different approaches to ODDs. Waymo’s fully driverless cars operate in sunny, geo-fenced suburbs of Phoenix with detailed mapping and remote supervision. Cruise began service in San Francisco, originally operating only at night to reduce complexity. Tesla’s Full Self Driving (FSD) Beta aims for broader generalization, but it still relies heavily on driver supervision and is limited by weather and visibility challenges.

Transit shuttles, though less publicized, have quietly become a practical application of AVs in controlled environments. These low-speed vehicles typically operate in geo-fenced areas such as university campuses, airports, or business parks. Companies like Navya, Beep, and EasyMile deploy shuttles that follow fixed routes and schedules, interacting minimally with complex traffic scenarios. Their ODDs are tightly defined: they may not operate in rain or snow, often run only during daylight, and avoid high-speed or mixed-traffic conditions. In many cases, a remote operator monitors operations or is available to intervene if needed. Delivery robots represent a third class of autonomous mobility—compact, lightweight vehicles designed for last-mile delivery. Their ODDs are perhaps the narrowest, but that’s by design. These robots, from companies like Starship, Kiwibot, and Nuro, navigate sidewalks, crosswalks, and short street segments in suburban or campus environments. They operate at pedestrian speeds (typically under 10 mph), carry small payloads, and avoid extreme weather, high traffic, or unstructured terrain. Because they don’t carry passengers, safety thresholds and regulatory oversight can differ significantly.

Weather is a particularly limiting factor across all autonomous systems. Rain, snow, fog, and glare interfere with LIDAR, radar, and camera performance—especially for smaller robots that operate close to the ground. Most AV deployments today restrict operations to fair-weather conditions. This is especially true for delivery robots and transit shuttles, which often halt operations during storms. While advanced sensor fusion and predictive modeling promise improvements, true all-weather autonomy remains a significant technical challenge. The intersection of weather and autonomy is an active research area [1]

Another ODD dimension is time of day. Nighttime operation brings unique difficulties for AVs: reduced visibility, increased pedestrian unpredictability, and in urban areas, more erratic driver behavior. Some systems (like Waymo in Chandler, AZ) now operate 24/7, but most deployments—particularly delivery robots and shuttles—remain restricted to daylight hours. Tesla's FSD does operate at night, but it still requires human oversight. Infrastructure also shapes ODDs in crucial ways. Many AV systems depend on high-definition maps, lane-level GPS, and even smart traffic signals to guide their decisions. In geo-fenced environments—where the route and surroundings are highly predictable—this infrastructure dependency is manageable. But for broader ODDs, where environments may change frequently or lack digital maps, achieving safe autonomy becomes much harder. That’s why passenger AVs today generally avoid rural areas, unpaved roads, or newly constructed zones.

Regulatory environments further shape ODDs. In the U.S., states like California, Arizona, and Florida have developed AV testing frameworks, but each differs in what it permits. For instance, California limits fully driverless vehicles to certain urban zones with strict reporting requirements. Delivery robots are often regulated at the city level—some cities allow sidewalk bots, others ban them outright. Transit shuttles often receive special permits for low-speed operation on limited routes. These regulatory boundaries translate directly into ODD constraints.

In terms of physical domains, Ground-based autonomous systems, especially in automotive contexts, are the most commercially visible. Self-driving vehicles operate in human-dense environments, requiring perception systems to identify pedestrians, cyclists, vehicles, and traffic infrastructure. Validation here relies heavily on scenario-based testing, simulation, and controlled pilot deployments. Standards like ISO 26262 (functional safety), ISO/PAS 21448 (SOTIF), and UL 4600 (autonomy system safety) guide safety assurance. Regulatory frameworks are evolving state-by-state or country-by-country, with Operational Design Domain (ODD) restrictions acting as practical constraints on deployment.

Autonomous aircraft (e.g., drones, urban air mobility platforms, and optionally piloted systems) must operate in highly structured, safety-critical environments. Validation involves rigorous formal methods, fault tolerance analysis, and conformance with aviation safety standards such as DO-178C (software), DO-254 (hardware), and emerging guidance like ASTM F38 and EASA's SC-VTOL. Airspace governance is centralized and mature, often requiring type certification and airworthiness approvals. Unlike automotive systems, airborne autonomy must prove reliability in loss-of-link scenarios and demonstrate fail-operational capabilities across flight phases.

Autonomous surface and underwater marine systems face unstructured and communication-constrained environments. They must operate reliably in GPS-denied or RF-blocked conditions while detecting obstacles like buoys, vessels, or underwater terrain. Validation is more empirical, often involving extended sea trials, redundancy in navigation systems, and adaptive mission planning. IMO (International Maritime Organization) and classification societies like DNV are working on Maritime Autonomous Surface Ship (MASS) regulatory frameworks, though global standards are still nascent. The dual-use nature of marine autonomy (civil and defense) adds governance complexity. Space-based autonomous systems (e.g., planetary rovers, autonomous docking spacecraft, and space tugs) operate under extreme constraints: communication delays, radiation exposure, and no real-time human oversight. Validation occurs through rigorous testing on Earth-based analog environments, formal verification of critical software, and fail-safe design principles. Governance falls under national space agencies (e.g., NASA, ESA) and international frameworks like the Outer Space Treaty. Assurance relies on mission-specific autonomy envelopes and pre-defined decision trees rather than reactive autonomy.

Governance also differs. Aviation and space operate within centralized, internationally coordinated regulatory systems (ICAO, FAA, EASA, NASA), while ground autonomy remains highly fragmented across jurisdictions. Maritime governance is progressing but lacks harmonization. Space governance, although anchored in treaties, increasingly contends with commercial activity and national interests, demanding updated risk management protocols.

Emerging efforts like the SAE G-34/SC-21 standard for AI in aviation, NASA's exploration of adaptive autonomy, and ISO’s work on AI functional safety indicate a trend toward domain-agnostic principles for validating intelligent behavior. There is growing recognition that autonomous systems, regardless of environment, need rigorous testing of edge cases, clarity of system intent, and real-time assurance mechanisms.

Autonomy Validation Tools

From ch. 8.0

Validation and verification (V&V) are critical processes in systems engineering and software development that ensure a system meets its intended purpose and functions reliably. Verification is the process of evaluating whether a product, service, or system complies with its specified requirements—essentially asking, “Did we build the system right?” It involves activities such as inspections, simulations, tests, and reviews throughout the development lifecycle. Validation, on the other hand, ensures that the final system fulfills its intended use in the real-world environment—answering the question, “Did we build the right system?” This typically includes user acceptance testing, field trials, and performance assessments under operational conditions. Together, V&V help reduce risks, improve safety and quality, and increase confidence that a system will operate effectively and as expected. In the context of autonomous systems, V&V combines two historical trends. The first from the mechanical systems and the second more recent one from classical digital decision systems. Finally, AI adds further complexity to the testing of the digital decision systems.

For traditional safety-critical systems in automotive, the evolution of V&V has been closely linked to regulatory standards frameworks such as ISO 26262. Key elements of this framework include:

System Design Process: A structured development assurance approach for complex systems, incorporating safety certification within the integrated development process.
Formalization: The formal definition of system operating conditions, functionalities, expected behaviors, risks, and hazards that must be mitigated.
Lifecycle Management: The management of components, systems, and development processes throughout their lifecycle.

The primary objective was to meticulously and formally define the system design, anticipate expected behaviors and potential issues, and comprehend the impact over the product's lifespan.

With the advent of conventional software paradigms, safety-critical V&V adapted by preserving the original system design approach while integrating software as system components. These software components maintained the same overall structure of fault analysis, lifecycle management, and hazard analysis within system design. However, certain aspects required extension. For instance, in the airborne domain, standard DO-178C, which addresses “Software Considerations in Airborne Systems and Equipment Certification,” updated the concept of hazard from physical failure mechanisms to functional defects, acknowledging that software does not degrade due to physical processes. Also revised were lifecycle management concepts, reflecting traditional software development practices. Design Assurance Levels (DALs) were incorporated, allowing the integration of software components into system design, functional allocation, performance specification, and the V&V process, akin to SOTIF in the automotive industry.

Table one above shows the difference between ISO 26262 and SOTIF. In general, the fundamental characteristics of digital software systems are problematic in safety critical systems. However, the IT sector has been a key megatrend which has transformed the world over the last 50 years. In the process, it has developed large ecosystems around semiconductors, operating systems, communications, and application software. At this point, using these ecosystems is critical to nearly every product’s success, so mixed-domain safety critical products are now a reality. Mixed Domain structures can be classified in three broad paradigms each of which have very different V&V requirements: Mechanical Replacement (Big Physical, small Digital), Electronic Adjacent (separate Physical and Digital), autonomy (Big Digital, small Physical). Drive-by-Wire functionality is an example of the mechanical replacement paradigm where the implementation of the original mechanical functionality is done by electronic components (HW/SW). In their initial configurations, these mixed electronic/mechanical systems were physically separated as independent subsystems. In this configuration, the V&V process looked very similar to the traditional mechanical verification process.

The paradigm of separate physical subsystems has the advantage of V&V simplification and safety, but the large disadvantage of component skew and material cost. Thus, an emerging trend has been to build underlying computational fabrics with networking and virtually (through software) separate functionality. From a V&V perspective, this means that the virtual backbone which maintains this separation (ex: RTOS) must be verified to a very high standard. Infotainment systems are an example of Electronics Adjacent integration. Generally, there is an independent IT infrastructure working with the safety critical infrastructure, and from a V&V perspective, they can be validated separately. However, the presence of infotainment systems enables very powerful communication technologies (5G, Bluetooth, etc.) where the cyber-physical system can be impacted by external third parties. From a safety perspective, the simplest method for maintaining safety would be to physically separate these systems. However, this is not typically done because a connection is required to provide “over-the-air” updates to the device. Thus, the V&V capability must again verify the virtual safeguards against malicious intent are robust. Finally, the last level of integration is in the context of autonomy. In autonomy, the processes of sensing, perception, location services, path planning envelope the traditional mechanical functionality.

Moving beyond software, AI has built a “learning” paradigm. In this paradigm, there is a period of training where the AI machine “learns” from data to build its own rules, and in this case, learning is defined on top of traditional optimization algorithms which try to minimize some notion of error. This effectively is data driven software development as shown in figure below. However,there are profound differences between AI software and conventional software. The introduction of AI generated software introduces significant issues to the V&V task as shown in table 2 below.

Ref:

[1] Vargas, J.; Alsweiss, S.; Toker, O.; Razdan, R.; Santos, J. An Overview of Autonomous Vehicles Sensors and Their Vulnerability to Weather Conditions. Sensors 2021, 21, 5397. https://doi.org/10.3390/s21165397

en/safeav/as/vreq.txt · Last modified: 2026/06/02 15:50 by raivo.sell