The list of book contributors is presented below.
Electronics design trends have revolutionized society. It began with centralized computing led by firms like IBM and DEC. These technologies enhanced productivity for global business operations, significantly impacting finance, HR, and administrative functions, eliminating the need for extensive paperwork. The next wave in economy-shaping technologies consisted of edge computing devices (red in the figure below) such as personal computers, cell phones, and tablets. With this capability, companies such as Apple, Amazon, Facebook, Google, and others could add enormous productivity to the advertising and distribution functions for global business. Suddenly, one could directly reach any customer anywhere in the world. This mega-trend has fundamentally disrupted markets such as education (online), retail (ecommerce), entertainment (streaming), commercial real estate (virtualization), health (telemedicine), and more. The next wave of electronics is the dynamic integration of artificial intelligence with physical assets, and the apex of this capability is autonomy.
Autonomy research traces its lineage to mid-20th-century cybernetics and control theory, where researchers like Norbert Wiener, Ross Ashby, and early robotics pioneers explored how machines could sense, process feedback, and act purposefully. The 1960s–1980s brought key breakthroughs: Shakey the Robot at SRI demonstrated integrated perception, planning, and action; DARPA’s Autonomous Land Vehicle project pushed early computer vision and navigation; and advances in probabilistic robotics—such as Kalman filtering, Bayesian estimation, and SLAM—formalized how autonomous systems make decisions under uncertainty. During this period, autonomy was largely rule-based and dominated by deterministic control, limited sensing, and narrow computational capabilities.
Modern autonomy began accelerating in the 1990s and 2000s with increased computing power, the rise of machine learning, and large-scale government programs. The DARPA Grand Challenges (2004–2007) marked a turning point, proving that self-driving vehicles could handle complex, unstructured environments and catalyzing both academic and commercial investment. The 2010s saw deep learning revolutionize perception, enabling robust object detection, scene understanding, and end-to-end control. This expanded autonomy from traditional robotics to autonomous systems in the ground, maritime, airborne, and space contexts.
Given the massive amount of research, several books have been written on autonomy. For example, Introduction to Autonomous Robots provides a comprehensive and accessible foundation for designing autonomous systems, covering the essential building blocks such as robot mechanisms, sensing modalities, actuation, perception, localization, mapping, and planning. It is widely used in university courses because it blends theory with practical algorithms, offering clear explanations of how autonomous robots interpret their environment and make decisions. Distributed Autonomous Robotic Systems, by contrast, focuses on the challenges and architectures of multi-robot and swarm systems, exploring decentralized control, coordination, communication, and robustness in distributed environments. Together, these two books span the spectrum from single-robot autonomy to collaborative, multi-agent systems, giving readers a solid grasp of both foundational robotics and the complexities of distributed autonomy.
In contrast to existing literature, this book focuses on the innovations required for a core design to be integrated into the governing systems in society. This process is especially challenging for autonomous systems because they integrate four broad domains which have traditionally not interacted with each other:
The remainder of this book is organized as follows. Chapter 2 provides a high-level introduction to autonomous systems, including the underlying technologies and their interaction with regulatory, safety, and standards environments. Chapter 3 examines hardware architectures, with particular emphasis on sensors, high-performance computing platforms, and emerging challenges in hardware supply chains. Chapter 4 focuses on software architecture, including real-time execution, safety-critical software development, and the growing importance of stable and secure software supply chains. Chapter 5 explores higher-level autonomy algorithms for perception, mapping, and localization, with a focus on system safety and reliability. Chapter 6 addresses planning, control, and decision-making, examining how autonomous systems translate perception into safe and effective action. Finally, Chapter 7 examines communication between autonomous systems, humans, and infrastructure—including human–machine interfaces (HMI) and vehicle-to-everything (V2X) communication—with an emphasis on integrated system safety and operational robustness.
Autonomous systems use sensors (e.g. cameras, radars, ultrasonic sensors) to collect information about the environment. The collected data are processed, and decisions regarding further action are made on their basis. What exactly is autonomy? The autonomy of a system can be defined as its ability to act according to its own goals, norms, internal states, and knowledge, without external human intervention. This means that autonomous systems are not limited to robots or unmanned vehicles. This definition includes any automatic functions that can reduce the level of workload or support the person driving the vehicle.
Autonomous systems use advanced technologies such as artificial intelligence, machine learning, neural networks, and the Internet of Things to perform tasks independently. They are a core element of today's Industry 4.0 and are used in areas ranging from robotics, through transport and logistics, to medicine and education. An example would be an autonomous car that makes decisions on its own based on data from sensors, or an automated guided vehicle (AGV) designed to transport loads safely and efficiently in a warehouse without operator supervision. Another application is production systems that, based on data from industrial sensors, automatically control production processes and machines and optimize production. This shortens production times, reduces production costs, and increases product quality. Autonomous systems are also used in transport and logistics, where they enable faster and more efficient delivery of goods. Thanks to the Internet of Things and monitoring systems, every stage of transport can be tracked, from loading to delivery, which allows for better control of the process. Autonomous systems are becoming an increasingly important part of our lives, and their development and application will have a growing impact on the future.
Autonomous systems operate in fundamentally different physical environments across ground, marine, airborne, and space domains, and these environmental differences strongly influence system design, sensing, safety, and operational architecture. Ground systems operate in highly structured but unpredictable environments with dense obstacles, human interaction, and high-bandwidth connectivity, requiring real-time perception, fast reaction times, and robust human safety assurance. Marine systems operate in less structured but slower-moving three-dimensional environments with fewer obstacles, limited connectivity, and strong environmental disturbances such as waves, currents, and corrosion, placing greater emphasis on long-duration reliability, navigation robustness, and remote supervision. Airborne systems operate in three-dimensional, safety-critical environments governed by strict airspace control, requiring extremely high reliability, precise navigation, fault tolerance, and formal certification due to the severe consequences of failure. Space systems operate in the most extreme and isolated environment, characterized by radiation exposure, vacuum, extreme temperature variation, and long communication delays, making real-time human intervention impossible and requiring systems to be highly autonomous, fault-tolerant, and capable of operating independently for extended periods. As a result, autonomy architectures, safety requirements, sensing modalities, and verification approaches vary significantly across these domains, even though they share common underlying principles of perception, decision-making, and control.
Overall, autonomy is a transformational technology that will drive economic processes and, through them, transform society. To be effective, autonomy must integrate with the critical elements of society, and the rest of this chapter discusses these in more detail.
Intuitively, autonomy of unmanned systems refers to their ability to self-manage, make decisions, and complete tasks with minimal or no human intervention. To collaborate with other systems or humans, autonomy requires a clear system definition. This definition not only communicates function to partners and users, but also sets an expectation function. Expectation functions are central to many technical (validation), governance (licensing), and legal (liability) processes. Each of the physical domains has built somewhat similar “levels” of autonomy, which begin to set these expectation functions.
For ground vehicles, in 2014, SAE International (formerly the Society of Automotive Engineers) adopted a classification of six levels of autonomous driving, which was subsequently modified in 2016. Based on a decision by the National Highway Traffic Safety Administration (NHTSA), this is the officially applicable classification in the United States, and it is also the one most widely used in studies on autonomous driving technologies in Europe.
To clarify the situation, SAE International has defined six levels of automation for autonomous vehicles, from Level 0 (no automation) to Level 5 (full automation), which have been adopted as an industry standard (see Figure 1).
Today, these levels have become the shorthand to communicate expectations and the object of regulatory and legal battles.
In general, autonomy or autonomous capability is defined in the context of decision-making or self-governance within a system. According to the Aerospace Technology Institute (ATI), autonomous systems can essentially decide independently how to achieve mission objectives, without human intervention 2). These systems are also capable of learning and adapting to changing operating environment conditions. However, autonomy may depend on the design, functions, and specifics of the mission or system 3). Autonomy can be broadly viewed as a spectrum of capabilities, from zero autonomy to full autonomy. The Pilot Authorisation and Control of Tasks (PACT) model assigns authorization levels from level 0 (full pilot authority) to level 5 (full system autonomy), a 0 to 5 scale that is also used in the automotive industry for autonomous vehicles (see Figure 2).
Levels of autonomy in drone technology are typically divided into five distinct levels, each representing a gradual increase in the drone's ability to operate independently.
Another general but useful model describing autonomy levels in unmanned systems is the Autonomy Levels for Unmanned Systems (ALFUS) model 5). The European Union Aviation Safety Agency (EASA), in one of its technical reports, provided some information on autonomy levels and guidelines for human-autonomy interactions. According to EASA, the concept of autonomy, its levels, and human-autonomous system interactions are not established and remain actively discussed in various areas (including aviation), as there is currently no common understanding of these terms 6). Because these concepts are still under development, they pose a major challenge for the regulation of unmanned aircraft.
The classification of autonomy levels in multi-drone systems is somewhat different. In multi-drone systems, several drones cooperate to perform a specific task. Designing multi-drone systems requires that individual drones have an increased level of autonomy. The classification of autonomy levels is directly related to the division into flights performed within the pilot's or observer's line of sight (VLOS) and flights performed beyond the pilot's line of sight (BVLOS), where particular attention is paid to flight safety. One way to address the autonomy issue is to classify the autonomy of drones and multi-drone systems into levels related to the hierarchy of tasks performed 7). These levels will have standard definitions and protocols that will guide technology development and regulatory oversight. For single-drone autonomy models, two distinct levels are proposed: the vehicle control layer (Level 1) and the mission control layer (Level 2), see Figure 4. Multi-drone systems, on the other hand, have three levels: single-vehicle control (Level 1), multi-vehicle control (Level 2), and mission control (Level 3). In this hierarchical structure, Level 3 has the lowest priority and can be overridden by Levels 2 or 1.
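The override rule in the three-level multi-drone hierarchy can be illustrated with a small arbitration routine. This is only a sketch: the function name `arbitrate`, the command strings, and the idea of representing each level's request as an optional string are assumptions made for illustration, not part of the cited model.

```python
from typing import Dict, Optional, Tuple

# Level numbers follow the text: a lower number means a higher priority.
LAYERS = {
    1: "single-vehicle control",
    2: "multi-vehicle control",
    3: "mission control",
}


def arbitrate(commands: Dict[int, Optional[str]]) -> Tuple[int, str]:
    """Return (level, command) from the highest-priority level that issued a command."""
    for level in sorted(LAYERS):  # visits 1, then 2, then 3
        command = commands.get(level)
        if command is not None:
            return level, command
    raise ValueError("no level issued a command")


# Mission control wants to continue, but vehicle control demands an avoidance manoeuvre.
print(arbitrate({3: "fly to waypoint B", 2: None, 1: "climb to avoid obstacle"}))
# Vehicle control is silent, so the multi-vehicle level overrides mission control.
print(arbitrate({3: "fly to waypoint B", 2: "hold formation", 1: None}))
```

The point of the sketch is only that a mission-level request is executed when no lower-numbered level objects. A real system would also need timing, safety, and communication handling that is not shown here.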
For marine systems, the International Maritime Organization (IMO) defines autonomy through its Maritime Autonomous Surface Ship (MASS) framework, which describes four progressive levels of autonomy based on the degree of human involvement and onboard decision-making capability. At lower levels, ships use automation primarily to assist human crews with navigation, propulsion, and safety monitoring, while humans remain onboard and responsible for operational decisions. Intermediate levels allow remote operation, where ships may operate without onboard crew but are supervised and controlled from shore-based control centers. At the highest level, fully autonomous vessels can perceive their environment, make navigation and mission decisions independently, and execute those decisions without human intervention. This framework reflects the operational realities of maritime missions, where long durations, predictable dynamics, and remote monitoring make gradual progression toward autonomy feasible.
In space systems, autonomy is commonly described using the Autonomy Levels for Unmanned Systems (ALFUS) framework, which originated in a working group led by the U.S. National Institute of Standards and Technology (NIST). ALFUS evaluates autonomy based on the system’s independence from human control, its ability to handle environmental complexity, and its capacity to accomplish mission objectives without intervention. At lower levels, spacecraft rely heavily on ground operators for command and control, executing predefined instructions with minimal onboard decision-making. As autonomy increases, spacecraft gain the ability to perform functions such as fault detection and recovery, autonomous navigation, and adaptive mission planning. At the highest levels, systems can independently perceive their environment, evaluate mission goals, and dynamically adjust their behavior to achieve objectives without real-time human guidance. This progression is particularly important in deep-space missions, where communication delays make continuous human control impractical.
Why marine and space autonomy frameworks differ from ground autonomy:
Marine and space autonomy frameworks differ fundamentally from ground autonomy because their operational constraints emphasize endurance, remote operation, and system resilience rather than continuous interaction with humans in dense, unpredictable environments. Ground vehicles must operate safely in close proximity to human drivers, pedestrians, and complex infrastructure, requiring highly responsive real-time perception and decision-making. In contrast, marine systems operate in relatively structured environments with fewer immediate hazards, allowing autonomy to focus more on navigation efficiency and remote supervision. Space systems present even greater challenges, including extreme communication latency, harsh environmental conditions, and the impossibility of real-time human intervention, requiring spacecraft to autonomously detect faults, maintain operational health, and ensure mission survival. As a result, autonomy in marine and space systems is driven more by operational independence and mission continuity than by immediate human safety interactions. The table below provides a summary of all four domains.
| Unified Level | Ground (SAE J3016) | Airborne (NASA / UAV / DoD) | Marine (IMO MASS / DNV) | Space (ALFUS) | Description |
|---|---|---|---|---|---|
| Level 0 | Level 0 – No automation | Manual flight | AL 0 – Manual ship | ALFUS 0 – Manual | Human performs all sensing, planning, and control |
| Level 1 | Level 1 – Driver assistance | Basic autopilot (e.g., altitude hold, heading hold) | MASS 1 – Decision support | ALFUS 1 – Teleoperation assist | Automation assists human but does not replace decision-making |
| Level 2 | Level 2 – Partial automation | Automated flight execution with supervision | MASS 2 – Remotely controlled with crew onboard | ALFUS 2 – Automated execution | System performs control functions but human supervises continuously |
| Level 3 | Level 3 – Conditional automation | Supervisory autonomy | MASS 3 – Remotely controlled without crew | ALFUS 3 – Supervisory autonomy | System performs mission tasks but human intervenes when needed |
| Level 4 | Level 4 – High automation | High autonomy UAV | MASS 4 – Fully autonomous ship | ALFUS 4–5 – High autonomy spacecraft | System operates independently in defined environments |
| Level 5 | Level 5 – Full automation | Fully autonomous UAV | Fully autonomous ship (advanced DNV AL 4+) | ALFUS 6 – Full autonomy | System operates independently in all environments |
The classification of autonomy into structured levels is not merely a technical taxonomy; it serves as a foundational construct for legal responsibility, regulatory approval, and ethical governance. These autonomy levels define an expectation function, which specifies who (human or machine) is responsible for sensing, decision-making, and action execution under defined operational conditions. This expectation function becomes the basis for certification, validation, liability assignment, and operational authorization which we will discuss in the next section.
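One way to make the idea of an expectation function concrete is to treat it as a lookup from autonomy level to the role expected of the human. In the sketch below, the descriptions are copied from the unified table above, while the class `Expectation` and its two boolean flags are this example's own reading of those descriptions and are not taken from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    description: str
    continuous_human_supervision: bool  # must a human watch the system at all times?
    human_fallback: bool                # is a human expected to step in when needed?


# Descriptions come from the unified table; the flags are an illustrative interpretation.
EXPECTATIONS = {
    0: Expectation("Human performs all sensing, planning, and control", True, True),
    1: Expectation("Automation assists human but does not replace decision-making", True, True),
    2: Expectation("System performs control functions but human supervises continuously", True, True),
    3: Expectation("System performs mission tasks but human intervenes when needed", False, True),
    4: Expectation("System operates independently in defined environments", False, False),
    5: Expectation("System operates independently in all environments", False, False),
}


def human_must_supervise(level: int) -> bool:
    """Return True if the unified level expects continuous human supervision."""
    return EXPECTATIONS[level].continuous_human_supervision


for level, expectation in EXPECTATIONS.items():
    print(level, human_must_supervise(level), expectation.human_fallback)
```

Written this way, the boundary between Level 2 and Level 3 stands out as the point where continuous supervision stops being expected.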
In society, products operate within the confines of a legal governance structure. The legal governance structure is one of the great inventions of civilization, and its primary role is to funnel disputes from unstructured expression, and perhaps even violence, to the domain of courts (figure 1). To be effective, legal governance structures must be perceived as fair and predictable. Fairness is obtained by a number of methods such as due process procedures, transparency and public proceedings, and neutral decision-makers (judges, juries, arbitrators). Predictability is achieved through the concept of precedent. Precedent is the idea that past rulings are given heavy weight in decision making, and it is an extraordinary event to diverge from precedent. Precedent gives the legal system stability. The combination of fairness and predictability shifts the resolution of disputes to a more orderly process, which promotes societal stability.
How does this work mechanically, and how does it connect to product development?
As shown in figure 2, there are three major stages. First, legal frameworks are established by law-making bodies (legislators). In practice, however, legislators cannot specify all aspects and empower administrative entities (regulators) to codify the details of the law. Regulators, in turn, often do not have the technical knowledge to codify all aspects of the law and rely on independent industry groups such as SAE International (formerly the Society of Automotive Engineers) or the Institute of Electrical and Electronics Engineers (IEEE) for technical knowledge. Second, in the field, disputes arise and must be adjudicated by the legal system. The typical process is a trial, conducted under the strict procedures established for fairness. The trial applies the facts to the legal framework and produces a judgement. The facts of the case can result in three potential outcomes. In the first situation, the facts are covered by the legal framework, so there is no further action relative to the governance structure. In the second case, the facts expose an “edge” condition in the governance structure. In this situation, the court looks for previous cases which might fit (the concept of precedent) and uses them to make its judgement. If no such case exists, the court can establish precedent with its judgement in this case, which then carries weight in future decisions as well. Finally, in rare situations, the facts of the case are in a field so new that there is little existing body of law. In these situations, the courts may make a judgement, but often there is a call for law-making bodies to establish deeper legal frameworks.
In fact, autonomous vehicles (AVs) are considered to be one of these situations. Why? In traditional automobiles, the body of law on product liability attaches to the car, and liability for actions taken with the car attaches to the driver. Further, product liability is often managed at the federal level and driver licensing more locally. However, surprisingly, as the figure below shows, there is a body of law dealing with autonomous vehicles from the distant past. In the days of horses, there were accidents, and a sophisticated liability structure emerged. In this structure, there was a concept that if a driver directed the horse into an accident, then the driver was at fault. However, if a bystander did something to “spook” the horse, it was the bystander's fault. Finally, there was also the concept of “no-fault” when a horse unexpectedly went rogue. A discerning reader may well understand that this body of law emerges from a deep understanding of the characteristics of a horse. In legal terms, it creates an “expectation.” What are the “expectations” for a modern autonomous vehicle? This is currently a highly debated point in the industry.
Overall, whatever value products provide to their consumers is weighed against the potential harm they cause, which leads to the concept of legal product liability. While laws diverge across geographies, the fundamental tenets rest on two key elements: expectation and harm. Expectation, as judged by “reasonable behavior given a totality of the facts,” attaches liability. As an example, the clear expectation is that a train cannot stop instantly if you stand in front of it, whereas this is not the expectation for most autonomous driving situations. Harm is the other key concept: AI recommendation systems for movies are not held to the same standards as autonomous vehicles. The governance framework for liability is mechanically developed through legislative actions and associated regulations. The framework is tested in the court system under the particular circumstances or facts of the case. To provide stability to the system, the database of cases and decisions is viewed as a whole under the concept of precedent. Clarification on legal points is provided by the appellate legal system, where arguments on the application of the law are decided and precedent is set.
What is an example of this whole situation? Consider the airborne domain in the figure above, where the governance framework consists of enacted law (in this case US) with associated cases providing legal precedent, regulations, and industry standards. Any product in the airborne sector must comply with this framework before it can be released to the marketplace.
As discussed in the governance module, whatever value products provide to their consumers is weighed against the potential harm caused by the product, which leads to the concept of legal product liability. From a product development perspective, the combination of laws, regulations, and legal precedent forms the overriding governance framework around which the system specification must be constructed [3]. The process of validation ensures that a product design meets the user's needs and requirements, and verification ensures that the product is built correctly according to design specifications.
Fig. 1. V&V and Governance Framework.

The Master V&V (MaVV) process needs to demonstrate that the product has been reasonably tested given the reasonable expectation of causing harm. It does so using three important concepts [4]:
As figure 1 shows, the Verification & Validation (V&V) process is the key input into the governance structure which attaches liability, and per the governance structure, each of the elements must show “reasonable due diligence.” An example of an unreasonable operational design domain (ODD) would be one in which an autonomous vehicle gives up control a millisecond before an accident.
Mechanically, MaVV is implemented with a Minor V&V (MiVV) process consisting of:
In practice, each of these steps can have quite a bit of complexity and associated cost. Since the ODD can be a very wide state space, generating the stimulus intelligently and efficiently is critical. Typically, stimulus generation starts out manual, but this quickly fails to scale. In virtual execution environments, pseudo-random directed methods are used to accelerate this process. In limited situations, symbolic or formal methods can be used to mathematically carry large state spaces through the whole design execution phase. Symbolic methods have the advantage of completeness but face computational explosion because many of the underlying problems are NP-complete.
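The pseudo-random directed approach described above can be sketched in a few lines. Everything specific here is hypothetical: the three ODD parameters, their bounds, the `Scenario` class, and the `bias` knob that steers a fraction of draws toward the ODD boundaries are illustrative assumptions, not a description of any particular tool.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    speed_kph: float
    rain_mm_per_h: float
    pedestrians: int


# Hypothetical ODD bounds: (min, max) for each scenario parameter.
ODD = {
    "speed_kph": (0.0, 120.0),
    "rain_mm_per_h": (0.0, 20.0),
    "pedestrians": (0, 10),
}


def generate(n: int, seed: int, bias: float = 0.3) -> List[Scenario]:
    """Draw n scenarios inside the ODD; roughly a fraction `bias` of values sit on a bound."""
    rng = random.Random(seed)  # a fixed seed makes every run reproducible

    def draw(lo, hi):
        if rng.random() < bias:
            return rng.choice((lo, hi))  # directed: pin the value to a boundary
        if isinstance(lo, int):
            return rng.randint(lo, hi)   # integer parameter, inclusive range
        return rng.uniform(lo, hi)       # continuous parameter

    return [
        Scenario(**{name: draw(lo, hi) for name, (lo, hi) in ODD.items()})
        for _ in range(n)
    ]


for scenario in generate(5, seed=42):
    print(scenario)
```

Seeding the generator matters in practice because a failing scenario can then be regenerated exactly when the failure is debugged.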
The execution stage can be done physically (such as on the test track above), but this process is expensive, slow, has limited controllability and observability, and in safety-critical situations is potentially dangerous. In contrast, virtual methods have the advantage of cost, speed, ultimate controllability and observability, and no safety issues. Virtual methods also have the great advantage of performing the V&V task well before the physical product is constructed. This leads to the classic V chart shown in figure 1. However, since virtual methods are a model of reality, they introduce inaccuracy into the testing domain, while physical methods are accurate by definition. Finally, one can intermix virtual and physical methods with concepts such as software-in-the-loop or hardware-in-the-loop.

The observable results of executing the stimulus are captured to determine correctness. Correctness is typically defined by either a golden model or an anti-model. The golden model, typically virtual, offers an independently verified model whose results can be compared to the product under test. Even in this situation, there is typically a divergence between the abstraction level of the golden model and the product which must be managed. Golden model methods are often used in computer architectures (e.g., Arm, RISC-V). The anti-model consists of error states which the product must not enter, and thus the correct behavior is the state space outside of the error states. An example might be in the autonomous vehicle space, where an error state might be an accident or a violation of any number of other constraints.

The MaVV consists of building a database of the various explorations of the ODD state space, and from that building an argument for completeness. The argument typically takes the form of a probabilistic analysis. After the product is in the field, field returns are diagnosed, and one must always ask the question: why did my original process not catch this issue?
Once the gap is found, the test methodology is updated so that similar issues are caught going forward. The V&V process is critical in building a product which meets customer expectations, and it documents the “reasonable” due diligence required for the purposes of product liability in the governance framework.
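The two styles of correctness check described above can be contrasted with a toy example. The function names are hypothetical; the "product" is a hand-written integer square root, the golden model is Python's standard `math.isqrt`, and the anti-model uses a single assumed error state (a gap of zero or less to the vehicle ahead).

```python
import math
from typing import Iterable, List


def golden_isqrt(n: int) -> int:
    """Golden model: an independently trusted reference implementation."""
    return math.isqrt(n)


def product_isqrt(n: int) -> int:
    """Hypothetical product under test: integer square root by binary search."""
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def check_against_golden(stimuli: Iterable[int]) -> List[int]:
    """Return every stimulus on which the product disagrees with the golden model."""
    return [n for n in stimuli if product_isqrt(n) != golden_isqrt(n)]


def violates_anti_model(gap_trace_m: Iterable[float], min_gap_m: float = 0.0) -> bool:
    """Anti-model: a run is wrong only if it ever enters an error state.

    The single error state here is a gap at or below min_gap_m (a collision);
    any behaviour that stays outside it is accepted.
    """
    return any(gap <= min_gap_m for gap in gap_trace_m)


print(check_against_golden(range(10_000)))    # empty list: no mismatches found
print(violates_anti_model([12.0, 3.5, 0.0]))  # the trace reaches a zero gap
```

The golden-model check says what the right answer is for each stimulus, while the anti-model only says what must never happen, which is why the latter is often easier to state for open-ended behaviours such as driving.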
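As one simple illustration of the probabilistic completeness argument mentioned above, consider the standard zero-failure bound. It assumes, as a simplification, that the $N$ executed scenarios are independent draws from the operating distribution over the ODD and that each fails with the same unknown probability $p$. The symbols $N$, $p$, and $\alpha$ are introduced here for illustration only.

```latex
% Probability that all N independent scenarios pass when the failure probability is p:
\Pr(\text{no failure in } N \text{ runs}) = (1 - p)^{N}

% The upper confidence bound p_u at confidence level 1 - \alpha solves (1 - p_u)^{N} = \alpha:
p_u = 1 - \alpha^{1/N} \le \frac{-\ln \alpha}{N}

% For \alpha = 0.05 this gives the "rule of three":
p_u \approx \frac{3}{N}
\qquad \text{e.g. } N = 3000 \text{ passing runs} \;\Rightarrow\; p_u \approx 10^{-3}
```

The bound shows why raw test counts alone scale poorly: demonstrating a failure probability ten times smaller requires roughly ten times as many passing runs.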
In most cases, the generic V&V process must grapple with massive ODD spaces, limited execution capacity, and high cost of evaluation. Further, all of this must be done in a timely manner to make the product available to the marketplace. Traditionally, V&V regimes have been bifurcated into two broad categories: Physics-Based and Decision-Based. We now discuss the key characteristics of each.
For MaVV, the critical factors are the efficiency of the MiVV “engine” and the argument for the completeness of the validation. Historically, mechanical/non-digital products (such as cars or airplanes) required sophisticated V&V. These systems were examples of a broader class of products which had a Physics-Based Execution (PBE) paradigm. In this paradigm, the underlying model execution (including real life) has the characteristics of continuity and monotonicity because the model operates in the world of physics. This key insight has enormous implications for V&V because it greatly constrains the potential state-space to be explored. Examples of this reduction of state-space include:
- Scenario Generation: One needs only worry about the state space constrained by the laws of physics. Thus, objects which violate the laws of physics cannot exist and need not be generated. Every actor is explicitly constrained by the laws of physics.
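The effect of monotonicity on the size of the search can be sketched with a braking example. The stopping-distance formula (reaction distance plus $v^2/(2a)$) is standard kinematics, but the 1 s reaction time, the 7 m/s² deceleration, and the function names are hypothetical values chosen only for illustration. Because stopping distance grows monotonically with speed, a bisection search finds the critical speed in a handful of evaluations instead of sweeping every speed bin.

```python
def stopping_distance_m(speed_mps: float, reaction_s: float = 1.0,
                        decel_mps2: float = 7.0) -> float:
    """Reaction distance plus braking distance v^2 / (2a); increases with speed."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2.0 * decel_mps2)


def critical_speed_mps(gap_m: float, v_max: float = 60.0, tol: float = 0.01) -> float:
    """Largest speed (to within tol) at which the vehicle still stops inside gap_m."""
    lo, hi = 0.0, v_max
    if stopping_distance_m(hi) <= gap_m:
        return hi  # safe across the whole speed range
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if stopping_distance_m(mid) <= gap_m:
            lo = mid  # safe here, so every lower speed is safe too
        else:
            hi = mid  # unsafe here, so every higher speed is unsafe too
    return lo


v_crit = critical_speed_mps(gap_m=50.0)
print(f"critical speed: {v_crit:.2f} m/s")
```

The search is only valid because the underlying model is continuous and monotone; it would give no guarantee for a system whose behaviour can change arbitrarily between two nearby inputs.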
Critically, all the speed bins below this critical speed are safe and do not have to be explored. Mechanically, in traditional PBE fields, the philosophy of safety regulation (ISO 26262 [5], AS9100 [6], etc.) builds the safety framework as a process, where
Traditionally, the faults considered are primarily mechanical failures. As an example, the flow for validating the braking system in an automobile through ISO 26262 would have the following steps:
Finally, the regulations have a strong idea of safety levels with Automotive Safety Integrity Levels (ASIL). Airborne systems follow a similar trajectory (pun intended) with the concept of Design Assurance Levels (DALs). A key part of the V&V task is to meet the standards required at each ASIL. Historically, a sophisticated set of V&V techniques has been developed to verify traditional automotive systems. These techniques included well-structured physical tests, often validated by regulators or sanctioned independent companies (e.g., TÜV SÜD [7]). Over the years, the use of virtual physics-based models has increased to model design tasks such as body design [8] or tire performance [9]. The general structure of these models is to build a simulation which is predictive of the underlying physics to enable broader ODD exploration. This creates a very important flow of characterization, model generation, predictive execution, and correction. Finally, because the execution is highly constrained by physics, virtual simulators can have limited performance and often require extensive hardware support for simulation acceleration. In summary, the key underpinnings of the PBE paradigm from a V&V point of view are:
As cyber-physical systems evolved, information technology (IT) rapidly transformed the world.
Fig. 4. Progression of System Specification (HW, SW, AI).
As shown in Figure 4, within electronics, there has been a progression of system function construction where the first stage was hardware or pseudo-hardware (FPGA, microcode). The next stage involved the invention of a processor architecture upon which software could imprint system function. Software was a design artifact written by humans in standard languages (C, Python, etc.). The revolutionary aspect of the processor abstraction allowed a shift in function without the need to shift physical assets. However, one needed legions of programmers to build the software. Today, the big breakthrough with Artificial Intelligence (AI) is the ability to build software with the combination of underlying models, data, and metrics.
In their basic form, IT systems were not safety critical, and the similar levels of legal liability have not attached to IT products. However, the size and growth of IT is such that problems in large volume consumer products can have catastrophic economic consequences [10]. Thus, the V&V function was very important. IT systems follow the same generic processes for V&V as outlined above, but with two significant differences around the execution paradigm and source of errors. First, unlike the PBE paradigm, the execution paradigm of IT follows a Decision Based Execution mode (DBE). That is, there are no natural constraints on the functional behavior of the underlying model, and no inherent properties of monotonicity. Thus, the whole massive ODD space must be explored which makes the job of generating tests and demonstrating coverage extremely difficult. To counter this difficulty, a series of processes have been developed to build a more robust V&V structure. These include: 1) Code Coverage: Here, the structural specification of the virtual model is used as a constraint to help drive the test generation process. This is done with software or hardware (RTL code). 2) Structured Testing: A process of component, subsection, and integration testing has been developed to minimize propagation of errors. 3) Design Reviews: Structured design reviews with specs and core are considered best practice.
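The code coverage idea in item 1 can be sketched as follows: the structure of the code (its branches) defines a checklist, and pseudo-random tests are generated until every item has been exercised. The `gear_selector` logic, its thresholds, and the hand-written branch markers are all hypothetical; in practice coverage is collected by tooling rather than by manual instrumentation.

```python
import random
from typing import Set

covered: Set[str] = set()
ALL_BRANCHES = {"reverse_rejected", "reverse_engaged", "low", "high"}


def gear_selector(speed_kph: float, reverse_requested: bool) -> str:
    """Hypothetical decision logic; each branch records itself when executed."""
    if reverse_requested:
        if speed_kph > 5.0:
            covered.add("reverse_rejected")
            return "hold"
        covered.add("reverse_engaged")
        return "R"
    if speed_kph < 30.0:
        covered.add("low")
        return "D1"
    covered.add("high")
    return "D2"


def run_until_covered(seed: int, max_tests: int = 10_000) -> int:
    """Run pseudo-random tests until every branch has executed; return the test count."""
    rng = random.Random(seed)
    covered.clear()
    for count in range(1, max_tests + 1):
        gear_selector(rng.uniform(0.0, 200.0), rng.random() < 0.5)
        if covered == ALL_BRANCHES:
            return count
    raise RuntimeError(f"branches never reached: {ALL_BRANCHES - covered}")


print("tests needed for full branch coverage:", run_until_covered(seed=0))
```

Note that the rarely taken `reverse_engaged` branch dominates the number of tests needed, and that full branch coverage says only that each decision was exercised, not that each outcome was correct.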
A good example of this process flow is the CMU Capability Maturity Model Integration (CMMI) [11] which defines a set of processes to deliver quality software. Large parts of the CMMI architecture can be used for AI when AI is replacing existing SW components. Finally, testing in the DBE domain decomposes into the following philosophical categories: “Known knowns:” Bugs or issues that are identified and understood, “Known unknowns” Potential risks or issues that are anticipated but whose exact nature or cause is unclear, and “Unknown unknowns” Completely unanticipated issues that emerge without warning, often highlighting gaps in design, understanding, or testing. The last category being the most problematic and most significant for DBE V&V. Pseudo-random test generation has been a key technique used as a method to expose this category [12]. In summary, the key underpinnings of the DBE paradigm from a V&V point of view are: 1) Unconstrained and not well-behaved execution space for scenario test generation, 2) Generally, less expensive simulation execution (no physical laws to simulate), 3) V&V focused on logical errors not mechanical failure 4) Generally, no defined regulatory process for safety critical applications. Most software is “best efforts,” 5) “Unknown-unknowns” a key focus of validation.
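The pseudo-random test generation mentioned above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration: the 8-bit saturating adder, the function names, and the deliberately planted off-by-one bug are assumptions, not taken from [12]. The idea is only that a seeded random campaign, checked against a trusted reference, can expose a corner case that nobody thought to write a directed test for.

```python
import random

MAX_U8 = 255  # largest value of an unsigned 8-bit result


def saturating_add_reference(a, b):
    """Trusted oracle: add two 8-bit values and saturate at 255."""
    return min(a + b, MAX_U8)


def saturating_add_under_test(a, b):
    """Hypothetical implementation with a planted off-by-one bug."""
    total = a + b
    if total > MAX_U8 + 1:  # bug: the comparison should be total > MAX_U8
        return MAX_U8
    return total & 0xFF  # wraps to 0 when total == 256


def random_test_campaign(seed, num_tests):
    """Run num_tests random stimuli; return the list of mismatches."""
    rng = random.Random(seed)  # a fixed seed makes every failure reproducible
    failures = []
    for _ in range(num_tests):
        a = rng.randint(0, MAX_U8)
        b = rng.randint(0, MAX_U8)
        expected = saturating_add_reference(a, b)
        actual = saturating_add_under_test(a, b)
        if actual != expected:
            failures.append((a, b, expected, actual))
    return failures


if __name__ == "__main__":
    found = random_test_campaign(seed=2024, num_tests=5000)
    print(f"{len(found)} mismatches; first few: {found[:3]}")
```

Only inputs that sum to exactly 256 fail, which is roughly 0.4% of the input space. A handful of hand-written tests could easily miss it, while a few thousand random stimuli are very likely to hit it. The fixed seed matters in practice: a failure that cannot be replayed is hard to debug.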
A key implication of the DBE space is that the idea from the PBE world of building a list of faults and building a safety argument for them is antithetical to the focus of DBE validation.
Finally, the product development process is typically focused on defining an ODD and validating against that situation. However, in modern times, an additional concern is that of adversarial attacks (cybersecurity). In this situation, an adversary wants to hijack the system for nefarious purposes. The product owner must therefore not only validate against the ODD, but also detect when the system is operating outside the ODD. After detection, the best-case scenario is to safely redirect the system to the ODD space. The risk associated with cybersecurity issues typically splits into three levels for cyber-physical systems:
In terms of governance, some reasonable due-diligence is expected to be provided by the product developer in order to minimize these issues. The level of validation required is dynamic in nature and connected to the norm in the industry.
In terms of domains, the Operational Design Domain (ODD) is the driving factor, and typically has two dimensions. The first is the operational model, and the second is the physical domain (ground, airborne, marine, space). On the ground, Passenger AVs are perhaps the most well-known face of autonomy, with robo-taxi services and self-driving consumer vehicles gradually entering urban environments. Companies like Waymo, Cruise, and Tesla have taken different approaches to ODDs. Waymo’s fully driverless cars operate in sunny, geo-fenced suburbs of Phoenix with detailed mapping and remote supervision. Cruise began service in San Francisco, originally operating only at night to reduce complexity. Tesla’s Full Self Driving (FSD) Beta aims for broader generalization, but it still relies heavily on driver supervision and is limited by weather and visibility challenges.
Transit shuttles, though less publicized, have quietly become a practical application of AVs in controlled environments. These low-speed vehicles typically operate in geo-fenced areas such as university campuses, airports, or business parks. Companies like Navya, Beep, and EasyMile deploy shuttles that follow fixed routes and schedules, interacting minimally with complex traffic scenarios. Their ODDs are tightly defined: they may not operate in rain or snow, often run only during daylight, and avoid high-speed or mixed-traffic conditions. In many cases, a remote operator monitors operations or is available to intervene if needed. Delivery robots represent a third class of autonomous mobility—compact, lightweight vehicles designed for last-mile delivery. Their ODDs are perhaps the narrowest, but that’s by design. These robots, from companies like Starship, Kiwibot, and Nuro, navigate sidewalks, crosswalks, and short street segments in suburban or campus environments. They operate at pedestrian speeds (typically under 10 mph), carry small payloads, and avoid extreme weather, high traffic, or unstructured terrain. Because they don’t carry passengers, safety thresholds and regulatory oversight can differ significantly.
Weather is a particularly limiting factor across all autonomous systems. Rain, snow, fog, and glare interfere with LiDAR, radar, and camera performance, especially for smaller robots that operate close to the ground. Most AV deployments today restrict operations to fair-weather conditions. This is especially true for delivery robots and transit shuttles, which often halt operations during storms. While advanced sensor fusion and predictive modeling promise improvements, true all-weather autonomy remains a significant technical challenge. The intersection of weather and autonomy is an active research area [1].
Another ODD dimension is time of day. Nighttime operation brings unique difficulties for AVs: reduced visibility, increased pedestrian unpredictability, and in urban areas, more erratic driver behavior. Some systems (like Waymo in Chandler, AZ) now operate 24/7, but most deployments—particularly delivery robots and shuttles—remain restricted to daylight hours. Tesla's FSD does operate at night, but it still requires human oversight. Infrastructure also shapes ODDs in crucial ways. Many AV systems depend on high-definition maps, lane-level GPS, and even smart traffic signals to guide their decisions. In geo-fenced environments—where the route and surroundings are highly predictable—this infrastructure dependency is manageable. But for broader ODDs, where environments may change frequently or lack digital maps, achieving safe autonomy becomes much harder. That’s why passenger AVs today generally avoid rural areas, unpaved roads, or newly constructed zones.
Regulatory environments further shape ODDs. In the U.S., states like California, Arizona, and Florida have developed AV testing frameworks, but each differs in what it permits. For instance, California limits fully driverless vehicles to certain urban zones with strict reporting requirements. Delivery robots are often regulated at the city level—some cities allow sidewalk bots, others ban them outright. Transit shuttles often receive special permits for low-speed operation on limited routes. These regulatory boundaries translate directly into ODD constraints.
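The ODD dimensions discussed above (weather, time of day, speed, and geography) can be captured as a small data structure with a runtime check. The sketch below is illustrative only: the class names, the field names, the bounding-box geofence, and the example values for a sidewalk delivery robot are all assumptions, not a description of any deployed system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalDesignDomain:
    """Hypothetical ODD description; all fields are illustrative."""
    allowed_weather: frozenset  # e.g. {"clear", "cloudy"}
    earliest_hour: int          # inclusive, local time 0-23
    latest_hour: int            # exclusive
    max_speed_mph: float
    geofence: tuple             # (min_lat, min_lon, max_lat, max_lon)


@dataclass(frozen=True)
class Conditions:
    """Snapshot of the current operating conditions."""
    weather: str
    hour: int
    speed_mph: float
    lat: float
    lon: float


def odd_violations(odd, now):
    """Return the names of every ODD dimension that 'now' violates."""
    violations = []
    if now.weather not in odd.allowed_weather:
        violations.append("weather")
    if not (odd.earliest_hour <= now.hour < odd.latest_hour):
        violations.append("time_of_day")
    if now.speed_mph > odd.max_speed_mph:
        violations.append("speed")
    min_lat, min_lon, max_lat, max_lon = odd.geofence
    if not (min_lat <= now.lat <= max_lat and min_lon <= now.lon <= max_lon):
        violations.append("geofence")
    return violations


# Assumed example: a daylight-only, fair-weather sidewalk robot.
robot_odd = OperationalDesignDomain(
    allowed_weather=frozenset({"clear", "cloudy"}),
    earliest_hour=7,
    latest_hour=19,
    max_speed_mph=10.0,
    geofence=(33.29, -111.85, 33.31, -111.83),
)

if __name__ == "__main__":
    print(odd_violations(robot_odd, Conditions("clear", 12, 4.0, 33.30, -111.84)))
    print(odd_violations(robot_odd, Conditions("rain", 21, 4.0, 33.30, -111.84)))
```

An empty list means the system is inside its ODD. A non-empty list names the dimensions that were breached, which is the kind of signal needed to trigger a safe fallback. A real geofence would use polygons and map data rather than a bounding box.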
In terms of physical domains, ground-based autonomous systems, especially in automotive contexts, are the most commercially visible. Self-driving vehicles operate in human-dense environments, requiring perception systems to identify pedestrians, cyclists, vehicles, and traffic infrastructure. Validation here relies heavily on scenario-based testing, simulation, and controlled pilot deployments. Standards like ISO 26262 (functional safety), ISO/PAS 21448 (SOTIF), and UL 4600 (autonomy system safety) guide safety assurance. Regulatory frameworks are evolving state-by-state or country-by-country, with Operational Design Domain (ODD) restrictions acting as practical constraints on deployment.
Autonomous aircraft (e.g., drones, urban air mobility platforms, and optionally piloted systems) must operate in highly structured, safety-critical environments. Validation involves rigorous formal methods, fault tolerance analysis, and conformance with aviation safety standards such as DO-178C (software), DO-254 (hardware), and emerging guidance like ASTM F38 and EASA's SC-VTOL. Airspace governance is centralized and mature, often requiring type certification and airworthiness approvals. Unlike automotive systems, airborne autonomy must prove reliability in loss-of-link scenarios and demonstrate fail-operational capabilities across flight phases.
Autonomous surface and underwater marine systems face unstructured and communication-constrained environments. They must operate reliably in GPS-denied or RF-blocked conditions while detecting obstacles like buoys, vessels, or underwater terrain. Validation is more empirical, often involving extended sea trials, redundancy in navigation systems, and adaptive mission planning. IMO (International Maritime Organization) and classification societies like DNV are working on Maritime Autonomous Surface Ship (MASS) regulatory frameworks, though global standards are still nascent. The dual-use nature of marine autonomy (civil and defense) adds governance complexity. Space-based autonomous systems (e.g., planetary rovers, autonomous docking spacecraft, and space tugs) operate under extreme constraints: communication delays, radiation exposure, and no real-time human oversight. Validation occurs through rigorous testing on Earth-based analog environments, formal verification of critical software, and fail-safe design principles. Governance falls under national space agencies (e.g., NASA, ESA) and international frameworks like the Outer Space Treaty. Assurance relies on mission-specific autonomy envelopes and pre-defined decision trees rather than reactive autonomy.
Governance also differs. Aviation and space operate within centralized, internationally coordinated regulatory systems (ICAO, FAA, EASA, NASA), while ground autonomy remains highly fragmented across jurisdictions. Maritime governance is progressing but lacks harmonization. Space governance, although anchored in treaties, increasingly contends with commercial activity and national interests, demanding updated risk management protocols.
Emerging efforts like the SAE G-34/SC-21 standard for AI in aviation, NASA's exploration of adaptive autonomy, and ISO’s work on AI functional safety indicate a trend toward domain-agnostic principles for validating intelligent behavior. There is growing recognition that autonomous systems, regardless of environment, need rigorous testing of edge cases, clarity of system intent, and real-time assurance mechanisms.
Validation and verification (V&V) are critical processes in systems engineering and software development that ensure a system meets its intended purpose and functions reliably. Verification is the process of evaluating whether a product, service, or system complies with its specified requirements, essentially asking, “Did we build the system right?” It involves activities such as inspections, simulations, tests, and reviews throughout the development lifecycle. Validation, on the other hand, ensures that the final system fulfills its intended use in the real-world environment, answering the question, “Did we build the right system?” This typically includes user acceptance testing, field trials, and performance assessments under operational conditions. Together, V&V help reduce risks, improve safety and quality, and increase confidence that a system will operate effectively and as expected. In the context of autonomous systems, V&V combines two historical trends: the first from mechanical systems, and the second, more recent one from classical digital decision systems. Finally, AI adds further complexity to the testing of digital decision systems.
For traditional safety-critical systems in automotive, the evolution of V&V has been closely linked to regulatory standards frameworks such as ISO 26262. Key elements of this framework include:
The primary objective was to meticulously and formally define the system design, anticipate expected behaviors and potential issues, and comprehend the impact over the product's lifespan.
With the advent of conventional software paradigms, safety-critical V&V adapted by preserving the original system design approach while integrating software as system components. These software components maintained the same overall structure of fault analysis, lifecycle management, and hazard analysis within system design. However, certain aspects required extension. For instance, in the airborne domain, standard DO-178C, which addresses “Software Considerations in Airborne Systems and Equipment Certification,” updated the concept of hazard from physical failure mechanisms to functional defects, acknowledging that software does not degrade due to physical processes. Also revised were lifecycle management concepts, reflecting traditional software development practices. Design Assurance Levels (DALs) were incorporated, allowing the integration of software components into system design, functional allocation, performance specification, and the V&V process, akin to SOTIF in the automotive industry.
Table 1 above shows the difference between ISO 26262 and SOTIF. In general, the fundamental characteristics of digital software systems are problematic in safety-critical systems. However, the IT sector has been a key megatrend which has transformed the world over the last 50 years. In the process, it has developed large ecosystems around semiconductors, operating systems, communications, and application software. At this point, using these ecosystems is critical to nearly every product’s success, so mixed-domain safety-critical products are now a reality. Mixed-domain structures can be classified into three broad paradigms, each of which has very different V&V requirements: Mechanical Replacement (Big Physical, small Digital), Electronics Adjacent (separate Physical and Digital), and Autonomy (Big Digital, small Physical). Drive-by-Wire functionality is an example of the Mechanical Replacement paradigm, where the original mechanical functionality is implemented by electronic components (HW/SW). In their initial configurations, these mixed electronic/mechanical systems were physically separated as independent subsystems. In this configuration, the V&V process looked very similar to the traditional mechanical verification process.
The paradigm of separate physical subsystems has the advantage of V&V simplification and safety, but the large disadvantage of a proliferation of component SKUs and higher material cost. Thus, an emerging trend has been to build underlying computational fabrics with networking and to separate functionality virtually (through software). From a V&V perspective, this means that the virtual backbone which maintains this separation (e.g., an RTOS) must be verified to a very high standard. Infotainment systems are an example of Electronics Adjacent integration. Generally, there is an independent IT infrastructure working alongside the safety-critical infrastructure, and from a V&V perspective, they can be validated separately. However, the presence of infotainment systems enables very powerful communication technologies (5G, Bluetooth, etc.) through which the cyber-physical system can be impacted by external third parties. From a safety perspective, the simplest method for maintaining safety would be to physically separate these systems. However, this is not typically done because a connection is required to provide “over-the-air” updates to the device. Thus, the V&V capability must again verify that the virtual safeguards against malicious intent are robust. Finally, the last level of integration is in the context of autonomy. In autonomy, the processes of sensing, perception, location services, and path planning envelop the traditional mechanical functionality.
Moving beyond software, AI has built a “learning” paradigm. In this paradigm, there is a period of training where the AI machine “learns” from data to build its own rules, and in this case, learning is defined on top of traditional optimization algorithms which try to minimize some notion of error. This is effectively data-driven software development, as shown in the figure below. However, there are profound differences between AI software and conventional software. The introduction of AI-generated software introduces significant issues to the V&V task, as shown in Table 2 below.
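The idea that “learning” is optimization over data can be made concrete with a deliberately tiny sketch. It assumes the simplest possible setting: a linear rule y = w·x + b, a mean-squared-error metric, and plain gradient descent. The function names, the learning rate, the step count, and the four data points are illustrative assumptions; real AI components use far larger models, but the loop has the same shape: model, data, and an error metric to minimize.

```python
def mean_squared_error(w, b, samples):
    """Error metric: average squared gap between prediction and label."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train_linear_rule(samples, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the "program" starts out knowing nothing
    n = len(samples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Training data generated by the (hidden) rule y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear_rule(samples)

if __name__ == "__main__":
    print(f"learned rule: y = {w:.4f} * x + {b:.4f}")
    print(f"remaining error: {mean_squared_error(w, b, samples):.2e}")
```

Nobody wrote the rule “multiply by two and add one” into the program; it was recovered from the data. That is also the V&V difficulty: the behavior is a product of the training set and the metric, so there is no human-authored specification of the rule to review.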
Ref:
[1] Vargas, J.; Alsweiss, S.; Toker, O.; Razdan, R.; Santos, J. An Overview of Autonomous Vehicles Sensors and Their Vulnerability to Weather Conditions. Sensors 2021, 21, 5397. https://doi.org/10.3390/s21165397
This chapter has provided an overview of autonomous systems (ground, airborne, marine, space), the initial framing of expectation functions for autonomy, the governance structures within which autonomy must operate, an overview of the validation and verification mechanisms used to support these governance structures, and finally an overview of autonomy in each of the physical domains.
Autonomy is part of the next big megatrend in electronics, which is likely to change society. As a new technology, it presents a large number of open research problems. These problems can be classified in four broad categories: Autonomy Hardware, Autonomy Software, Autonomy Ecosystem, and Autonomy Business Models. In terms of hardware, autonomy consists of a mobility component (increasingly becoming electric), sensors, and computation.
Research in sensors for autonomy is rapidly evolving, with a strong focus on “sensor fusion, robustness, and intelligent perception.” One exciting area is “multi-modal sensor fusion,” where data from LiDAR, radar, cameras, and inertial sensors are combined using AI to improve perception in complex or degraded environments. Researchers are developing uncertainty-aware fusion models that not only integrate data but also quantify confidence levels, essential for safety-critical systems. There's also growing interest in “event-based cameras” and “adaptive LiDAR,” which offer low-latency or selective scanning capabilities for dynamic scenes, while self-supervised learning enables autonomous systems to extract semantic understanding from raw, unlabeled sensor data. Another critical thrust is the development of resilient and context-aware sensors. This includes sensors that function in all-weather conditions, such as “FMCW radar” and “polarization-based vision,” and systems that can detect and correct for sensor faults or spoofing in real-time. Researchers are also exploring “terrain-aware sensing,” “semantic mapping,” and “infrastructure-to-vehicle (I2V)” sensor networks to extend situational awareness beyond line-of-sight. Finally, sensor co-design—where hardware, placement, and algorithms are optimized together—is gaining traction, especially in “edge computing architectures” where real-time processing and low power are crucial. These advances support autonomy not just in cars, but also in drones, underwater vehicles, and robotic systems operating in unstructured or GPS-denied environments.
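To make the idea of uncertainty-aware fusion concrete, the sketch below uses classical inverse-variance weighting rather than the AI-based fusion models described above. It assumes each sensor reports a range estimate together with a variance, and that the sensor errors are independent and roughly Gaussian. The function name and the LiDAR/radar numbers are illustrative assumptions.

```python
def fuse_estimates(measurements):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Returns (fused_value, fused_variance). Assumes independent errors.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    if any(variance <= 0 for _, variance in measurements):
        raise ValueError("variances must be positive")
    # A more certain sensor (smaller variance) gets a larger weight.
    weights = [1.0 / variance for _, variance in measurements]
    total_weight = sum(weights)
    fused_value = sum(
        weight * value for weight, (value, _) in zip(weights, measurements)
    ) / total_weight
    # The fused variance is never larger than the best single sensor's.
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


if __name__ == "__main__":
    lidar = (10.0, 0.04)  # assumed: range in metres, variance in m^2
    radar = (10.4, 0.16)
    value, variance = fuse_estimates([lidar, radar])
    print(f"fused range: {value:.3f} m, variance: {variance:.4f} m^2")
```

The fused estimate sits closer to the more certain sensor, and the returned variance is the confidence figure a downstream planner can reason about. In degraded weather, a sensor would report a larger variance and automatically contribute less.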
In terms of computation, exciting research focuses on enabling real-time decision-making in environments where cloud connectivity is limited, latency is critical, and power is constrained. One prominent area is the “co-design of perception and control algorithms with edge hardware,” such as integrating neural network compression, quantization, and pruning techniques to run advanced AI models on embedded systems (e.g., NVIDIA Jetson, Qualcomm RB5, or custom ASICs). Research also targets “dynamic workload scheduling,” where sensor processing, localization, and planning are intelligently distributed across CPUs, GPUs, and dedicated accelerators based on latency and energy constraints. Another major focus is on “adaptive, context-aware computing,” where the system dynamically changes its computational load or sensing fidelity based on situational awareness—for instance, increasing compute resources during complex maneuvers or reducing them during idle cruising. Related to this is “event-driven computing” and “neuromorphic architectures” that mimic biological efficiency to reduce energy use in perception tasks. Researchers are also exploring “secure edge execution,” such as trusted computing environments and runtime monitoring to ensure deterministic behavior under adversarial conditions. Finally, “collaborative edge networks,” where multiple autonomous agents (vehicles, drones, or infrastructure nodes) share compute and data at the edge in real time, open new frontiers in swarm autonomy and decentralized intelligence.
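As a small illustration of the quantization techniques mentioned above, the following sketch applies symmetric per-tensor 8-bit quantization to a short list of weights. It is a simplified scheme chosen for clarity and does not represent the toolchain of any particular embedded platform. The function names and the example weights are assumptions.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.013, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"codes={codes} scale={scale:.5f} worst-case error={worst:.5f}")
```

Each weight now fits in one byte instead of four, at the cost of a rounding error bounded by half the scale. Small weights such as 0.013 lose the most relative precision, which is one reason quantized models have to be re-validated rather than assumed equivalent to the original.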
Finally, as there is a shift towards “software defined vehicles,” there is an increasing need to develop computing hardware architectures bottom-up with critical properties of software reuse and underlying hardware innovation. This process mimics computer architectures in information technology, but does not exist in the world of autonomy today.
In terms of software, important system functions such as perception, path planning, and location services sit in the software/AI layer. While somewhat effective, AV stacks are considerably less effective than a human, who can navigate the world while expending only about 100 watts of power. There are a number of places where human and machine autonomy differ. These include:
Thus, in addition to traditional machine learning techniques, newer AI architectures with properties of robustness, power/compute efficiency, and effectiveness are open research problems.
In terms of Ecosystem, key open research problems exist in areas such as safety validation, V2X communication, and ecosystem partners.
Verification and validation (V&V) for autonomous systems is evolving rapidly, with key research focused on making AI-driven behavior both “provably safe and explainable.” One major direction involves “bounding AI behavior” using formal methods and developing “explainable AI” (XAI) that supports safety arguments regulators and engineers can trust. Research is also focused on “rare and edge-case scenario generation” through adversarial learning, simulation, and digital twins, aiming to create test cases that challenge the limits of perception and planning systems. Defining new “coverage metrics,” such as semantic or risk-based coverage, has become crucial, as traditional code coverage doesn’t capture the complexity of non-deterministic AI components. Another active area is “scalable system-level V&V,” where component-level validation must support higher-level safety guarantees. This includes “compositional reasoning,” contracts-based design, and model-based safety case automation. The integration of digital twins for closed-loop simulation and real-time monitoring is enabling continuous validation even post-deployment. In parallel, “cybersecurity-aware V&V” is emerging, focusing on spoofing resilience and securing the validation pipeline itself. Finally, standardization of simulation formats (e.g., ASAM OpenSCENARIO) and the rise of “test infrastructure-as-code” are laying the groundwork for scalable, certifiable autonomy, especially under evolving frameworks like UL 4600 and ISO 21448.
One of the ecosystem aids to autonomy may be a connection to the infrastructure, and in mixed human/machine environments there is, of course, the natural Human Machine Interface (HMI). Key research in V2X (Vehicle-to-Everything) for autonomy centers on enabling cooperative behavior and enhanced situational awareness through low-latency, secure communication. A major area of focus is on “reliable, high-speed communication” via technologies like “C-V2X and 5G/6G,” which are critical for supporting time-sensitive autonomous functions such as coordinated lane changes, intersection management, and emergency response. Closely linked is the development of “edge computing architectures,” where V2X messages are processed locally to reduce latency and support real-time decision-making. Research is active in “cooperative perception,” where vehicles and infrastructure share sensor data to extend the field of view beyond occlusions, enabling safer navigation in complex urban environments. Another core research direction is the integration of “smart infrastructure and digital twins,” where roadside sensors provide real-time updates to HD maps and augment vehicle perception. This is essential for detecting dynamic road conditions, construction zones, and temporary signage. In parallel, ensuring “security and privacy in V2X communication” is a growing concern. Work is underway on encrypted, authenticated protocols and on methods to detect and respond to malicious actors or faulty data. Standardization and interoperability are also vital for large-scale deployment; efforts are focused on harmonizing communication protocols across vendors and regions and on developing robust, scenario-based testing frameworks that incorporate both simulation and physical validation. Finally, an open research issue is the tradeoff between individual autonomy and dependence on infrastructure. Associated with infrastructure dependence are open issues of legal liability, business model, and cost.
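A scenario-level coverage metric, as opposed to code coverage, can be sketched in a few lines. The example below assumes that the ODD has been discretized into named parameters with a finite set of values each, and it measures the fraction of parameter combinations that the executed test scenarios have exercised. The parameter names, values, and function name are illustrative assumptions. This is a plain combinatorial metric and is not weighted by risk.

```python
from itertools import product


def scenario_coverage(parameter_bins, executed_scenarios):
    """Return (fraction of ODD cells exercised, sorted list of missed cells).

    parameter_bins: dict mapping parameter name -> list of allowed values.
    executed_scenarios: list of dicts, one value per parameter.
    Cells are tuples ordered by sorted parameter name.
    """
    names = sorted(parameter_bins)
    all_cells = set(product(*(parameter_bins[name] for name in names)))
    exercised = {
        tuple(scenario[name] for name in names)
        for scenario in executed_scenarios
    }
    covered = exercised & all_cells  # ignore scenarios outside the ODD grid
    missing = sorted(all_cells - covered)
    return len(covered) / len(all_cells), missing


if __name__ == "__main__":
    bins = {
        "weather": ["clear", "rain", "fog"],
        "lighting": ["day", "night"],
    }
    runs = [
        {"weather": "clear", "lighting": "day"},
        {"weather": "rain", "lighting": "day"},
        {"weather": "clear", "lighting": "night"},
        {"weather": "clear", "lighting": "day"},  # a repeat adds no coverage
    ]
    fraction, missing = scenario_coverage(bins, runs)
    print(f"coverage: {fraction:.0%}; untested cells: {missing}")
```

The list of missed cells is often more useful than the percentage, because it tells the test team which conditions (here, every fog case and rain at night) have never been exercised. A risk-based variant would weight each cell by exposure and severity instead of counting them equally.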
Human-Machine Interface (HMI) for autonomy remains an area with several open research and design challenges, particularly around trust, control, and situational awareness. One major issue is how to build “appropriate trust and transparency” between users and autonomous systems. Current interfaces often fail to clearly convey the vehicle’s capabilities, limitations, or decision-making rationale, which can lead to overreliance or confusion. There's a delicate balance between providing sufficient information to promote understanding and avoiding cognitive overload. Additionally, ensuring “safe and intuitive transitions of control,” especially in Level 3 and Level 4 autonomy, remains a critical concern. Drivers may take several seconds to re-engage during a takeover request, and the timing, modality, and clarity of such prompts are not yet standardized or optimized across systems. Another set of challenges lies in maintaining “situational awareness” and designing “adaptive, accessible interfaces.” Passive users in autonomous systems tend to disengage, losing track of the environment, which can be dangerous during unexpected events. Effective HMI must offer context-sensitive feedback using visual, auditory, or haptic cues while adapting to the user’s state, experience level, and accessibility needs. Moreover, autonomous vehicles currently lack effective ways to interact with external actors—such as pedestrians or other drivers—replacing human cues like eye contact or gestures. Developing standardized, interpretable external HMIs, a language of driving, remains an active area of research. Finally, a lack of unified metrics and regulatory standards for evaluating HMI effectiveness further complicates design validation, making it difficult to compare systems or ensure safety across manufacturers.
Finally, autonomy will have implications on topics such as civil infrastructure guidance, field maintenance, interaction with emergency services, interaction with disabled and young riders, insurance markets, and most importantly the legal profession. There are many research issues underlying all of these topics.
In terms of business models, use models and their implications for the supply chain are open research problems. For the supply chain, the critical technology is semiconductors, whose economics are highly sensitive to very high volume. For example, the largest market in mobility, the auto industry, is approximately 10% of semiconductor volume, and the other forms (airborne, marine, space) are orders of magnitude lower. From a supply chain perspective, a small number of SKUs which service a large market are ideal. The research problem is: what should be the nature of these very scalable components? In terms of end markets, autonomy in traditional transportation is likely to lead to a reduction in unit volume. Why? With autonomy, one can get much higher utilization (vs. the < 5% in today's automobiles). However, it is also likely that autonomy unleashes a broad class of solutions in markets such as agriculture, warehouses, distribution, delivery, and more. Micromobility applications in particular offer some interesting options for very high volumes. The exact nature of the applications is an open research problem.
In the subsequent chapters, we will delve deeper into these topics with a framing informed by autonomy abstractions as shown in the figure below. At the “bottom” of these abstractions are the physical objects such as the mechanical devices and the associated electronics hardware. Layered above the electronics hardware layer are various software layers which start with middleware/infrastructure, algorithmic layers, and finally the connection to humans.
These topics will be addressed at the conceptual level and also examined in specific fashion for the four physical domains (example figure below).
The underlying active physical components for all electronic systems are semiconductors. Semiconductors span several major categories based on function, material system, and integration level. At the most basic level are discrete devices such as diodes, MOSFETs, IGBTs, and rectifiers, which control current and voltage and are widely used in power conversion and motor drives. Analog and mixed-signal semiconductors handle sensing, amplification, signal conditioning, and power management (e.g., ADCs, DACs, voltage regulators, sensor interfaces). Memory semiconductors, such as DRAM, SRAM, NAND flash, and emerging non-volatile memories like MRAM, store data and program code. Power semiconductors use materials such as silicon, silicon carbide (SiC), and gallium nitride (GaN) to efficiently switch high voltages and currents in electric vehicles, aircraft power systems, and renewable energy converters. Finally, specialized devices such as RF front-end chips, image sensors (CMOS), FPGAs, and AI accelerators support communication, perception, and high-performance computing tasks. Together, these categories form the layered semiconductor ecosystem that underpins modern automotive, airborne, marine, and space electronic architectures. An important category is digital logic devices, which include microcontrollers (MCUs), microprocessors (MPUs), and system-on-chip (SoC) devices that execute programming of some form (FPGA, software, AI). We shall discuss this in greater detail in the next chapter on software.
In this chapter, we shall review historical background to the absorption of semiconductors in various mobility domains. As a part of this background, we shall outline some key “productization” challenges such as safety, governance, and supply chain management. With this background, we will introduce the jump in complexity introduced by autonomy and revisit the key “productization” challenges.
Historically, cyber-physical systems were mechanically based, but with the advent of modern electronics, critical functions moved rapidly to electronic subsystems. In automotive electronics, for example, tightening emissions standards in the U.S., Europe, and Japan in the 1970s and early 1980s pushed automakers to adopt microprocessor-based engine control units (ECUs). What began as simple ignition timing modules evolved into closed-loop engine management systems handling fuel injection and knock control (the “Power Train” block shown in the graphic). These early semiconductor deployments were ruggedized analog/mixed-signal designs, optimized for reliability in high-temperature environments rather than computational complexity.
Through the late 1980s and 1990s, electronics expanded from powertrain into chassis and safety systems. Anti-lock braking systems (ABS), electronic stability control, traction control, and eventually electric power steering (EPS) required real-time sensing and actuation. This corresponds to the “Chassis” and “Safety and Control” domains in the image (ABS, airbag controllers, TPMS, collision warning). Here, semiconductors enabled distributed sensing (wheel speed sensors, accelerometers, pressure sensors) and deterministic embedded processing. The architecture remained domain-centric: each function had its own ECU, with limited cross-domain integration. The next wave, roughly 1995–2010, was driven less by regulation and more by consumer expectation. Vehicles became platforms for infotainment and comfort electronics, shown in the graphic’s “Infotainment” and “Comfort and Control” sections (dashboard displays, navigation, climate control, seat modules, body electronics). This phase marked the introduction of higher-performance digital SoCs, memory subsystems, and human-machine interface processors. Importantly, this is when in-vehicle networking standards such as CAN, LIN, and later FlexRay (listed under “Networking”) became essential. The car shifted from isolated ECUs to a distributed electronic architecture connected by data buses; semiconductors were no longer just controllers but nodes in a communication network.
Figure 1: Automobile electronics
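To make the idea of ECUs as nodes on a shared bus more concrete, the sketch below simulates how CAN resolves contention when several nodes start transmitting at the same time: identifiers are sent most-significant bit first, a dominant bit (0) overrides a recessive bit (1), and the frame with the lowest identifier wins. This is a minimal illustration rather than a CAN stack; the function name `arbitrate` and the message names and identifiers are hypothetical.

```python
def arbitrate(identifiers, id_bits=11):
    """Return the identifier that wins bitwise CAN-style arbitration.

    The bus acts as a wired-AND: if any node drives a dominant bit (0),
    every node reads 0. A node that sent a recessive bit (1) but reads 0
    has lost arbitration and stops transmitting.
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one identifier is required")
    if any(not 0 <= ident < (1 << id_bits) for ident in contenders):
        raise ValueError("identifier does not fit in id_bits")

    for bit in range(id_bits - 1, -1, -1):  # most-significant bit first
        bus_level = min((ident >> bit) & 1 for ident in contenders)
        contenders = {i for i in contenders if (i >> bit) & 1 == bus_level}

    (winner,) = contenders  # distinct IDs always leave exactly one survivor
    return winner


# Hypothetical message identifiers, chosen only for illustration.
pending = {"brake_status": 0x0A0, "engine_rpm": 0x100, "media_info": 0x300}
winner_id = arbitrate(pending.values())
winner_name = next(name for name, ident in pending.items() if ident == winner_id)
print(f"{winner_name} (0x{winner_id:03X}) transmits first")
```

Because lower identifiers win, designers typically assign the lowest identifiers to the most time-critical messages, which is one reason the bus behaves predictably under load.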
By the 2010s, semiconductor content per vehicle had grown sharply, especially with hybrid and electric vehicles. Power electronics (IGBTs, MOSFETs, later SiC devices), battery management systems, and high-voltage control loops dramatically increased the role of advanced semiconductor materials and mixed-signal integration. Simultaneously, advanced driver assistance systems (ADAS), including collision warning, parking assist, and night vision, required vision processors, radar front-ends, and sensor fusion chips, extending the “Safety and Control” block into high-performance computing territory.
Airborne Sector
If the automotive graphic represents the distributed, domain-based maturation of electronics in cars, the airborne sector followed a similar—but more safety-critical and certification-driven—trajectory. In the early jet age (1950s–1970s), aircraft electronics—then called avionics—were largely analog and federated. Radar, navigation, flight instruments, engine monitoring, and autopilot systems were separate boxes with limited interconnection. Semiconductors initially replaced vacuum tubes for reliability and weight reduction, but computational capability was modest. Much like early automotive engine controllers, electronics were introduced to solve specific operational needs—navigation accuracy, radio communication, and flight stabilization—rather than to create an integrated digital platform. The major inflection point came in the 1980s and 1990s with the rise of digital flight control and “fly-by-wire” architectures, pioneered in civil aviation by aircraft such as the Airbus A320 and expanded in military platforms like the F-16 Fighting Falcon. Here, semiconductors moved from advisory roles to safety-critical control loops. Digital signal processors and radiation-tolerant microcontrollers executed deterministic real-time algorithms for stability augmentation, envelope protection, and engine control (FADEC).
During the 1990s–2000s, avionics entered a “glass cockpit” era. Aircraft such as the Boeing 777 replaced analog gauges with integrated digital displays driven by high-reliability processors and graphics subsystems. Data buses such as ARINC 429 and later AFDX (ARINC 664) enabled deterministic networking between flight computers, sensors, and displays—analogous to CAN and FlexRay in the automotive diagram. However, unlike automotive networks, airborne data buses were built around strict partitioning, redundancy, and fault containment regions. Triple-modular redundancy and dissimilar processors became common for flight-critical functions. In propulsion and power systems, semiconductors expanded from monitoring to active control. Full Authority Digital Engine Control (FADEC) units used mixed-signal ASICs and microprocessors to optimize fuel flow, reduce emissions, and improve reliability. With the emergence of “more-electric aircraft” concepts—exemplified by the Boeing 787—power electronics content increased substantially. High-voltage converters, motor drives, and solid-state power controllers replaced hydraulic subsystems, mirroring (though earlier in safety rigor) the electrification wave seen in automotive HEV/EV platforms.
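The triple-modular redundancy mentioned above can be sketched in a few lines. The example below shows two common voting styles under simplifying assumptions: an exact majority vote for discrete commands and a mid-value select for continuous signals. The function names are hypothetical, and real flight computers add tolerance bands, timing checks, and channel monitoring that are omitted here.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Majority-vote three redundant channel outputs (discrete values).

    Returns (voted_value, faulty_channel_indices). Raises ValueError when
    all three channels disagree, because no majority exists.
    """
    values = (a, b, c)
    value, count = Counter(values).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three channels disagree")
    faulty = [i for i, v in enumerate(values) if v != value]
    return value, faulty


def mid_value_select(a, b, c):
    """Return the median of three continuous readings.

    A single wildly wrong channel can never be the median, so one faulty
    sensor cannot drive the output.
    """
    return sorted((a, b, c))[1]


print(tmr_vote("GEAR_DOWN", "GEAR_DOWN", "GEAR_UP"))  # channel 2 is outvoted
print(mid_value_select(10.0, 10.2, 55.0))             # outlier 55.0 is ignored
```

The voter masks a single faulty channel but not two simultaneous faults, which is one reason flight-critical designs also use dissimilar processors to reduce common-mode failures.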
Marine Sector
The marine industry’s use of electronics evolved from isolated navigation aids to highly integrated digital ship systems, following a trajectory structurally similar to automotive but at much larger power scales and with longer asset lifecycles. In the 1950s through the 1970s, marine electronics were primarily analog and functionally segregated: radar, sonar, gyrocompasses, VHF radios, and basic autopilots operated as standalone systems. Early semiconductor adoption focused on improving reliability and reducing size, particularly in radar and communication equipment. These systems were advisory in nature; propulsion and steering remained largely mechanical or hydraulic. The first major digital transition occurred in the 1980s and 1990s with the arrival of microprocessor-based engine control, satellite navigation (GPS), and electronic charting systems. Ships began incorporating digital propulsion governors, fuel optimization systems, and centralized alarm monitoring. This period resembles the automotive shift from carburetors to engine control units and ABS systems. Importantly, networking standards such as NMEA 0183 and later NMEA 2000 allowed sensors and navigation systems to exchange data, marking the move from isolated instrumentation to distributed marine electronics architectures.
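As a small illustration of how NMEA 0183 devices exchange data, the sketch below frames and validates a sentence using the standard checksum: the XOR of every character between `$` and `*`, written as two hexadecimal digits. The helper names and the example sentence body are illustrative assumptions, not taken from any particular product.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR all characters of the sentence body; return two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def frame(body: str) -> str:
    """Wrap a sentence body as $<body>*<checksum>."""
    return f"${body}*{nmea_checksum(body)}"


def is_valid(sentence: str) -> bool:
    """Check that a received sentence carries a matching checksum."""
    sentence = sentence.strip()  # drop the trailing CR/LF terminator
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, received = sentence[1:].rpartition("*")
    return received.upper() == nmea_checksum(body)


# Illustrative heading sentence; the field values are made up.
sentence = frame("HCHDT,90.0,T")
print(sentence, is_valid(sentence))
print(is_valid(sentence.replace("90.0", "91.0")))  # corrupted field is rejected
```

The checksum catches single-character corruption on a noisy serial link, but it is not a security mechanism: anyone who can write to the bus can compute a valid checksum.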
By the 2000s, large commercial and naval vessels adopted Integrated Bridge Systems (IBS) and Integrated Platform Management Systems (IPMS), consolidating radar, charting, sonar, propulsion status, and safety alerts into unified digital consoles. Power electronics content increased significantly with electric propulsion drives, thruster control, hybrid marine power systems, and dynamic positioning systems. This phase mirrors the automotive expansion into electrification and body-domain integration. In recent years, semiconductor density has grown further with sensor fusion for collision avoidance, remote fleet monitoring, predictive maintenance, and early-stage autonomous surface vessels. While regulatory frameworks remain conservative, marine architecture now consists of interconnected propulsion, navigation, safety, power distribution, and autonomy subsystems — conceptually analogous to the domain blocks in the automotive graphic.
Space Sector
The space sector followed a parallel but more reliability-driven evolution, shaped by radiation tolerance, extreme environments, and mission assurance requirements. In the early space age, spacecraft electronics were built from discrete logic and radiation-hardened components with very limited computational capacity. Systems were strictly federated: guidance, telemetry, power conditioning, communications, and thermal control were separate subsystems with built-in redundancy. Early digital computers such as those used in the Apollo Guidance Computer demonstrated that semiconductors could enable autonomous navigation, but computational margins were minimal and fault tolerance was paramount. During the 1990s and early 2000s, radiation-hardened microprocessors and standardized spacecraft data buses such as MIL-STD-1553 and SpaceWire enabled more modular digital architecture. Satellites adopted structured subsystems for attitude determination and control, onboard data handling, payload processing, and power regulation. Missions like the Hubble Space Telescope and deep-space platforms such as the Mars Reconnaissance Orbiter incorporated increasingly sophisticated onboard processing for navigation, instrument control, and fault management. This stage resembles the distributed ECU era in automotive, where each domain was digitally controlled but interconnected via deterministic buses. In the modern era, semiconductor capability in space systems has expanded dramatically. High-throughput communications satellites, FPGA-based reconfigurable payloads, advanced solid-state power controllers, electric propulsion systems, and autonomous fault detection algorithms define current architectures. Commercial constellations developed by companies such as SpaceX have introduced vertically integrated avionics stacks and more software-defined spacecraft platforms. 
Unlike automotive, however, semiconductor design in space prioritizes radiation hardening, redundancy, and long-duration reliability over cost optimization. The overall trajectory mirrors the automotive diagram’s layered growth: from instrumentation digitization to closed-loop control, to networked subsystems, and now toward increasingly autonomous, software-defined space platforms.
Across marine and space domains — as in automotive — semiconductor adoption progressed from monitoring to control, from isolated subsystems to networked architecture, and from mechanical dominance to electrically and computationally mediated platforms. The architectural blocks differ in naming (propulsion, navigation, attitude control, power conditioning), but structurally they represent the same historical layering visible in the automotive figure.
As semiconductor content in vehicles increased, automotive safety protocols evolved from informal engineering practices to highly structured, lifecycle-based governance frameworks that now extend down to silicon IP and AI behavior. In the 1980s and 1990s, when electronic systems such as ABS and airbag controllers first became widespread, safety assurance was largely handled through company-specific processes. OEMs and Tier-1 suppliers relied on internal FMEA methods, redundancy design practices, and in some cases adaptations of aerospace guidance like DO-178 concepts. There was no unified automotive electronic safety standard, even as vehicles transitioned from isolated ECUs to increasingly networked systems.
The first major formal framework influencing automotive electronics was IEC 61508, published in 1998. IEC 61508 introduced Safety Integrity Levels (SILs), lifecycle safety management, probabilistic hardware fault metrics, and the concept of a structured safety case. However, it was designed as a generic standard for industrial programmable electronic systems. As vehicle architectures became more distributed and semiconductor complexity grew—moving from simple microcontrollers to multi-domain ECUs connected via CAN—automotive stakeholders recognized the need for a sector-specific adaptation.
That led to the publication of ISO 26262 in 2011. ISO 26262 was a transformative step, introducing Automotive Safety Integrity Levels (ASIL A–D), formal Hazard Analysis and Risk Assessment (HARA), hardware architectural metrics such as Single Point Fault Metric (SPFM) and Latent Fault Metric (LFM), and strict requirements traceability across the development lifecycle. Importantly, ISO 26262 directly influenced semiconductor design. Silicon vendors began offering ASIL-ready microcontrollers with lockstep CPU cores, ECC-protected memory, watchdog timers, and documented FMEDA data to support system integrators. Safety moved from being a vehicle-level validation exercise to being embedded in chip architecture and development processes.
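The hardware architectural metrics named above reduce to simple ratios of failure rates. As a rough sketch, SPFM = 1 − Σ(λ_SPF + λ_RF) / Σλ and LFM = 1 − Σλ_latent / Σ(λ − λ_SPF − λ_RF), summed over the safety-related hardware elements. The code below applies these ratios to a made-up bill of elements and compares the result with the commonly cited targets for ASIL B, C, and D. The element names and FIT values are invented for illustration, and a real FMEDA involves far more detail than this.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    fit_total: float    # total failure rate of the safety-related element (FIT)
    fit_spf_rf: float   # portion that is single-point or residual faults (FIT)
    fit_latent: float   # portion that is latent multiple-point faults (FIT)


# Commonly cited ISO 26262-5 targets: (minimum SPFM, minimum LFM) per ASIL.
TARGETS = {"B": (0.90, 0.60), "C": (0.97, 0.80), "D": (0.99, 0.90)}


def spfm(elements):
    """Single Point Fault Metric over all safety-related elements."""
    total = sum(e.fit_total for e in elements)
    return 1.0 - sum(e.fit_spf_rf for e in elements) / total


def lfm(elements):
    """Latent Fault Metric over all safety-related elements."""
    total = sum(e.fit_total for e in elements)
    spf_rf = sum(e.fit_spf_rf for e in elements)
    return 1.0 - sum(e.fit_latent for e in elements) / (total - spf_rf)


def highest_asil_met(elements):
    """Return the highest ASIL whose targets are both met, or None."""
    s, l = spfm(elements), lfm(elements)
    met = [asil for asil, (s_min, l_min) in TARGETS.items()
           if s >= s_min and l >= l_min]
    return met[-1] if met else None


# Hypothetical microcontroller breakdown; the numbers are not real data.
example = [
    Element("lockstep CPU", 100.0, 0.5, 4.0),
    Element("ECC RAM", 200.0, 1.0, 10.0),
    Element("watchdog", 20.0, 0.2, 2.0),
]
print(f"SPFM={spfm(example):.4f}  LFM={lfm(example):.4f}  "
      f"meets ASIL {highest_asil_met(example)}")
```

The example shows why features such as lockstep cores and ECC matter: they convert faults that would otherwise be single-point into detected faults, which raises SPFM directly.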
The historical progression of safety protocols in airborne systems reflects the increasing reliance on semiconductors in avionics, flight control, and mission-critical software. Unlike automotive, aviation adopted structured safety governance very early, because electronics entered directly into safety-critical control loops such as autopilot and fly-by-wire. Also, increasing integration of custom ASICs and programmable logic devices in avionics led to the publication of DO-254 in 2000. DO-254 formalized design assurance for airborne electronic hardware, including FPGAs and complex microcircuits. It required documented development lifecycles, verification rigor proportional to hardware design assurance levels, and traceability from requirements to implementation.
For marine systems, as digital navigation and propulsion control systems expanded in the 1980s and 1990s, regulatory attention shifted toward reliability and redundancy of electronic systems. Classification societies such as DNV, Lloyd's Register, and American Bureau of Shipping developed rules for electrical and control systems onboard ships. These rules require redundancy in steering and propulsion control, fault tolerance in dynamic positioning systems, and environmental qualification of electronics for vibration, humidity, and salt exposure. The introduction of the Global Maritime Distress and Safety System (GMDSS) in the 1990s marked a major digital milestone. Satellite communications, automated distress signaling, and integrated bridge systems increased semiconductor density. As ships adopted Integrated Bridge Systems (IBS) and Integrated Platform Management Systems (IPMS), classification societies began issuing more formal guidance on software quality, failure mode analysis, and cyber resilience. Still, marine governance remained largely prescriptive and performance-based, rather than process-assurance-based.
Finally, space safety and electronics assurance evolved under extreme reliability constraints from the beginning, due to the impossibility of repair and the high cost of mission failure. Early space programs operated under agency-specific reliability and redundancy doctrines rather than formalized software standards. NASA and defense space agencies emphasized radiation hardening, hardware redundancy, and conservative design margins. Spacecraft have used fault detection, isolation, and recovery (FDIR) techniques from the outset.
Overall, safety standards in each domain have tracked the growing reliance on electronic systems.
As discussed in chapter 2, all of these systems live under a governance structure where validation and verification technology links the technical world to the governance structure. Critical in enabling these processes is the domain of Electronic Design Automation (EDA). EDA refers to the software tools and workflows used to design, verify, and prepare semiconductor devices and electronic systems for manufacturing. At the chip level, the flow typically begins with system architecture and specification, followed by separate but converging analog and digital design streams. In digital design, engineers describe functionality using hardware description languages (HDLs) such as Verilog or VHDL, simulate for functional correctness, synthesize to logic gates, and perform place-and-route to create a physical layout. This is followed by static timing analysis, power analysis, signal integrity checks, and increasingly, formal verification and functional safety validation (e.g., ISO 26262 contexts). In analog/mixed-signal design, the flow is more device- and layout-centric: schematic capture, SPICE-level simulation (corner, Monte Carlo, noise, mismatch), layout with careful parasitic extraction, and iterative verification (LVS/DRC). At advanced nodes, the boundary between analog and digital blurs in mixed-signal SoCs, requiring tight co-simulation and cross-domain verification.
Once the silicon design is complete, the flow extends to package design, which has become increasingly critical in advanced-node and heterogeneous integration contexts (e.g., chiplets, 2.5D/3D integration). Package EDA tools model signal integrity, power integrity, thermal behavior, and mechanical stress across substrates, interposers, and bumps. The package is no longer a passive carrier; it is an electrical extension of the die, affecting timing closure, power delivery, and high-speed interfaces (e.g., UCIe, HBM). Finally, at the PCB level, board design tools integrate schematic capture, component placement, routing, and multi-physics analysis (signal integrity, EMI/EMC, thermal). High-speed digital systems require co-design between chip I/O, package escape routing, and PCB stackup to maintain impedance control and timing margins. Modern EDA workflows increasingly emphasize cross-domain co-design—from transistor to board—because performance, reliability, and safety are emergent properties of the entire electronic system, not just the silicon alone.
The Electronic Design Automation (EDA) industry is highly concentrated, with dominant global vendors controlling the majority of advanced semiconductor design workflows. Synopsys, Cadence Design Systems, and Siemens EDA (formerly Mentor Graphics) collectively provide end-to-end toolchains spanning digital implementation, analog/mixed-signal design, verification, IP integration, packaging, PCB design, and multi-physics analysis. Synopsys is particularly strong in digital synthesis, verification, and IP; Cadence has deep capabilities in custom/analog design and system analysis; and Siemens EDA is well known for PCB design, verification, and manufacturing integration. Beyond the “big three,” companies such as Ansys play a critical role in sign-off physics (signal integrity, power integrity, thermal, electromagnetics), while emerging players focus on AI-assisted design automation and specialized domains like photonics or chiplet integration. The high technical complexity, deep foundry integration (e.g., with TSMC, Samsung, Intel), and massive R&D investment required at advanced nodes create significant barriers to entry, reinforcing the industry’s oligopolistic structure.
Physical testing of electronics spans wafer probe, packaged device qualification, board-level validation, and full system stress testing, and is supported by a concentrated set of global vendors. In semiconductor production test, automated test equipment (ATE) leaders such as Teradyne and Advantest dominate high-volume logic, memory, and SoC testing, enabling parametric characterization, functional verification, and speed binning at wafer and final test. For reliability and environmental stress—HTOL, temperature cycling, vibration, and humidity—chamber providers like ESPEC and Thermotron are widely used in automotive and aerospace qualification flows. Electrical measurement and compliance validation at the device and board level rely heavily on instrumentation from Keysight Technologies and Rohde & Schwarz, particularly for high-speed interfaces and RF systems. Inspection and failure analysis—critical for advanced packaging and heterogeneous integration—often leverage X-ray and acoustic microscopy systems from Nordson, as well as materials analysis platforms from Thermo Fisher Scientific. Together, these vendors underpin the physical validation layer that complements design verification, ensuring performance, reliability, and safety before deployment into mission-critical applications.
Another key aspect of governance is the management of shared resources. In the mechanical world, this means traffic laws and the regulation of transportation infrastructure. In electronics, it means managing the shared frequency spectrum and health and safety issues. In the US, the primary legal basis for shared use is the Communications Act of 1934, which created the regulator, the Federal Communications Commission (FCC). The FCC manages the radio spectrum (figure 1) through a range of regulatory and technical actions to ensure its efficient and interference-free use. It allocates specific frequency bands for various services (such as broadcasting, cellular, satellite, public safety, and amateur radio) based on national needs and international agreements. The FCC issues licenses to commercial and non-commercial users, setting terms for power limits, coverage areas, and operating conditions. It also conducts spectrum auctions to assign frequencies for commercial use, such as 5G, while reserving portions for public services and unlicensed uses like Wi-Fi.
In addition, the FCC enforces rules to prevent harmful interference, coordinates spectrum sharing and repurposing efforts, and leads initiatives like dynamic spectrum access and band reallocation to adapt to evolving technological demands. To enforce these standards, the FCC requires many devices to undergo testing and certification before they can be marketed or sold in the United States. This process is carried out by FCC-recognized testing laboratories, known as accredited Conformity Assessment Bodies (CABs), which evaluate products against applicable Part 15 or Part 18 regulations, among others. Certified devices must meet limits on emissions, immunity, and specific absorption rate (SAR) when applicable. Once a product passes testing, the lab submits a report to a Telecommunications Certification Body (TCB), which issues the FCC ID and authorizes the product for sale. These labs play a critical role in ensuring compliance, supporting innovation while maintaining spectrum integrity and public safety.
FCC Part 15 and Part 18 differ primarily in the type and purpose of radio frequency (RF) emissions they regulate. Part 15 governs devices that intentionally or unintentionally emit RF energy for communication purposes, such as Wi-Fi routers, Bluetooth devices, and computers. These devices must not cause harmful interference and must accept interference from licensed users. In contrast, Part 18 regulates Industrial, Scientific, and Medical (ISM) equipment that emits RF energy not for communication, but for performing physical functions like heating, welding, or medical treatments—examples include microwave ovens and RF diathermy machines. While both parts limit electromagnetic interference, Part 15 devices operate under stricter emissions limits due to their proximity to communication bands, whereas Part 18 devices are allowed higher emissions in designated ISM frequency bands. Additionally, health and safety regulations for Part 18 equipment are typically overseen by other agencies such as the FDA or OSHA, while the FCC focuses on interference mitigation.
A key instrument for electromagnetic testing is an anechoic chamber (figure 2). An anechoic chamber is a specialized, sound- and radio wave-absorbing enclosure designed to create an environment free from reflections and external interference. Its walls, ceiling, and floor are typically lined with wedge-shaped foam or ferrite tiles that absorb electromagnetic or acoustic waves, depending on the application. For radio frequency (RF) testing, the chamber is constructed with conductive materials (like steel or copper) to form a Faraday cage, isolating it from external RF signals. In acoustic chambers, sound-absorbing foam eliminates echoes and simulates free-field conditions. Anechoic chambers are critical in industries such as telecommunications, defense, aerospace, and consumer electronics, where they are used to test antenna performance, electromagnetic compatibility (EMC), emissions compliance, radar systems, or audio equipment in highly controlled, repeatable conditions. The chamber ensures that test measurements reflect only the characteristics of the device under test (DUT), without environmental interference.
All hardware in all the domains of interest (ground, airborne, marine, space) must comply with FCC standards and, in cases involving human contact, FDA standards for health and safety.
Finally, testing labs and services organizations play a critical role in certifying electronics against national and international standards, particularly for safety, electromagnetic compatibility (EMC), environmental robustness, and reliability. Global conformity assessment firms such as UL Solutions, TÜV SÜD, Intertek, and Bureau Veritas provide third-party testing and certification to standards such as IEC 61000 (EMC), IEC 62368 (product safety), ISO 26262 (automotive functional safety), DO-160 (aerospace environmental conditions), and MIL-STD-810 (defense environmental testing). These organizations operate accredited laboratories (often ISO/IEC 17025 certified) that conduct emissions and immunity testing, thermal cycling, vibration, ingress protection (IP), and safety evaluations required for CE marking, FCC authorization, automotive AEC qualification, and other regulatory approvals. In highly regulated sectors—automotive, aerospace, medical, and industrial—independent lab validation provides not only compliance evidence but also liability mitigation and market access assurance, making standards-driven testing an essential bridge between engineering validation and commercial deployment.
In product development, the initial focus is on functionality and differentiated value. As discussed in the governance sections, the next stage is to make sure the product conforms within the appropriate regulatory frameworks connected to safety and shared usage. The final stage and perhaps the most important stage is that of consistently delivering and supporting the product in the marketplace. To consistently deliver the product, one must manage the supply chain which drives the forward delivery of the product. In addition, as customers interact with the product, there is a reverse flow which involves reparability, diagnostics, and in most situations safe disposal.
For most products, the mechanical component supply chain, maintenance, and calibration have a rich, well-established history. As discussed, recent history has seen a large infusion of semiconductors. Supply Chain Management (SCM) refers to the strategic coordination of procurement, production, logistics, and distribution processes to ensure timely and cost-effective delivery of materials and systems [61]. The SCOR model, developed by the Supply Chain Council (SCC), is a widely used framework for designing and evaluating supply chains [62]. It organizes supply chain activity into process phases such as Plan, Source, Make, Deliver, and Return.
Each phase integrates digital tools and real-time analytics to ensure supply resilience and performance traceability.
Lean Supply Chain Management
Lean SCM focuses on minimizing waste (time, material, cost) across the chain while maximizing value for the customer [63]. In autonomous system production, Lean methods include:
Lean thinking improves agility in responding to rapid technological changes and component obsolescence.
Agile and Digital Supply Chains
Recent developments have introduced Agile Supply Chain concepts, emphasizing adaptability, visibility, and rapid reconfiguration [64]. Digital Supply Chain (DSC) technologies such as:
Risk Management and Resilience Building
Supply chain risk management (SCRM) in autonomous systems involves proactive identification and mitigation of disruptions:
AI-based SCRM tools (e.g., Resilinc, Everstream) now monitor supplier health and logistics delays in real time.
Challenges in Supply Chain Management
| Challenge | Description | Impact |
|---|---|---|
| Component Scarcity | Limited supplies for high-performance chips or sensors. | Production delays, increased cost. |
| Globalization Risks | Dependence on international logistics and trade. | Exposure to geopolitical instability. |
| Quality Variability | Inconsistent supplier quality control. | Rework and testing overhead. |
| Cybersecurity Threats | Counterfeit or tampered components. | System failure or security breaches. |
| Data Supply Issues | Dependence on labelled datasets or simulation platforms. | Delayed AI development or bias introduction. |
Environmental and Ethical Constraints
Supply chains for autonomy-related technologies often rely on materials such as lithium, cobalt, and rare earth metals used in sensors and batteries. Ethical sourcing, sustainability, and carbon accountability are now critical supply chain dimensions [53].
Example: Regulations aimed at preventing the sourcing of minerals from conflict-affected regions—particularly in parts of Central Africa—focus on “conflict minerals” such as tin, tungsten, tantalum, and gold (3TG). In the United States, Section 1502 of the Dodd-Frank Wall Street Reform and Consumer Protection Act requires publicly traded companies to conduct due diligence and disclose whether these minerals originated from the Democratic Republic of the Congo or adjoining countries, while the European Union enforces similar supply-chain due diligence under the EU Conflict Minerals Regulation. These frameworks compel companies to trace supply chains, implement risk mitigation processes aligned with OECD guidance, and publicly report sourcing practices to reduce the financing of armed groups.
The Rise of Supply Chain Cybersecurity
As hardware and software become interconnected, supply chain cybersecurity has emerged as a critical risk domain. Compromised firmware or cloned microcontrollers can introduce vulnerabilities deep within a system’s hardware root of trust [54]. Security frameworks such as NIST SP 800-161, ISO/IEC 27036, and Cybersecurity Maturity Model Certification (CMMC) are being applied to mitigate these threats.
Ground Systems:
For ground systems, the automotive industry has evolved over time into a highly optimized supplier structure, with Original Equipment Manufacturers (OEMs) and a tiered series of suppliers (Table 1).
| Level | Supplier |
|---|---|
| OEM | BMW, Ford, GM, Mercedes-Benz, Toyota, etc. |
| Infrastructure | Government (federal, state, local), cellular (safety), map applications, etc. |
| Tier 1 (Systems) | Continental, Delphi, Bosch, Denso, etc. |
| Tier 2 (Parts) | Texas Instruments, NXP, TDK, Yazaki, Bridgestone, etc. |
| Tier 3 (Materials) | 3M, DuPont, BASF, Shin-Etsu, etc. |
Table 1. Automotive supplier tiers.
Further, much like the US Department of Defense, automotive companies traditionally require chips with automotive-grade certification. Automotive-grade components must meet stringent requirements: passive components need AEC-Q200 qualification, while active components, including automotive chips, need AEC-Q100 qualification. Both are also expected to meet ISO 26262 ASIL B and IATF 16949 requirements.
Airborne (Aerospace)
In aerospace, the supply chain evolved around regulatory certification authority and system safety long before cost optimization became dominant. As aircraft systems transitioned from analog to fly-by-wire and software-intensive architectures, standards such as DO-178 (software), DO-254 (hardware), and ARP4754 (system development) forced a structural shift: Tier-1 suppliers became deeply embedded in certification artifacts, not just hardware delivery. Companies such as Honeywell and Raytheon Technologies (Collins Aerospace) do not merely supply components; they co-own verification evidence, safety analyses, and traceability matrices required by the FAA/EASA. This creates a tightly coupled, long-cycle ecosystem where primes like Boeing act as system-of-systems integrators, and switching suppliers is extremely costly due to recertification burdens. The airborne model therefore evolved into a high-barrier, risk-sharing, assurance-centric hierarchy.
Marine
Marine supply chains historically centered on shipyards and mechanical systems, with less formalized tier structures than aerospace. Oversight came from classification societies (e.g., DNV, ABS) rather than centralized regulators, and vessels were often semi-custom builds. However, as digital navigation, dynamic positioning, and now autonomy have increased system complexity, Tier-1 marine technology firms such as Kongsberg Gruppen and Wärtsilä have moved closer to aerospace-style system integration roles. Unlike automotive’s scale-driven tiers, marine tiers evolved around project integration and compliance with flag-state and class requirements. The current autonomy push is accelerating a transition toward software-centric supply chains, but production volume remains low and customization remains high, keeping marine structurally more fragmented than aerospace.
Space
The space industry began as a vertically integrated, government-driven ecosystem dominated by primes such as Lockheed Martin and Boeing under cost-plus contracts with agencies like NASA and the DoD. Reliability and mission assurance, not cost efficiency, defined supplier relationships, and specialized radiation-hardened component vendors formed niche Tier-2/3 layers. In the last decade, however, companies like SpaceX have reintroduced vertical integration to compress development cycles and control risk across propulsion, avionics, and launch operations. The result is a bifurcated supply chain: one high-assurance national security chain with traditional tier structures, and one commercially agile “NewSpace” chain that blends COTS components with vertically integrated primes. Certification and mission risk, rather than volume economics, remain the dominant structural forces.
Semiconductor Economics:
The cost of building a semiconductor device is dominated by three interacting factors: design (NRE), wafer fabrication, and volume, all of which are tightly linked to the lithography node. At advanced nodes (e.g., 5 nm, 3 nm), non-recurring engineering (NRE) costs can exceed hundreds of millions of dollars due to mask sets, EDA complexity, verification effort, and IP integration, while wafer costs rise sharply because of EUV lithography, tighter process control, and lower initial yields. As a result, cutting-edge nodes only make economic sense at very high production volumes, where fixed design and mask costs can be amortized over millions of units; otherwise, the cost per die becomes prohibitive. Conversely, mature nodes (e.g., 28 nm, 40 nm, 65 nm) have far lower mask and wafer costs, stable yields, and shorter development cycles, making them economically attractive for automotive, industrial, and mixed-signal applications where performance density is less critical and production volumes may be moderate rather than massive.
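The amortization argument can be made concrete with a simple cost model: unit cost ≈ NRE / volume + wafer cost / (dies per wafer × yield). The sketch below evaluates that model for two scenarios. Every number in it is a placeholder chosen to show the shape of the curve, not market data, and the model deliberately ignores packaging, test, and IP royalties.

```python
def unit_cost(nre, wafer_cost, dies_per_wafer, yield_fraction, volume):
    """Approximate cost per good die: amortized NRE plus silicon cost."""
    if volume <= 0 or dies_per_wafer <= 0:
        raise ValueError("volume and dies_per_wafer must be positive")
    if not 0 < yield_fraction <= 1:
        raise ValueError("yield_fraction must be in (0, 1]")
    good_dies_per_wafer = dies_per_wafer * yield_fraction
    return nre / volume + wafer_cost / good_dies_per_wafer


# Placeholder scenarios (hypothetical values, for illustration only).
scenarios = {
    "advanced node": dict(nre=300e6, wafer_cost=17_000,
                          dies_per_wafer=500, yield_fraction=0.70),
    "mature node": dict(nre=5e6, wafer_cost=3_000,
                        dies_per_wafer=500, yield_fraction=0.95),
}

for volume in (100_000, 1_000_000, 100_000_000):
    for name, params in scenarios.items():
        cost = unit_cost(volume=volume, **params)
        print(f"{name:>13} @ {volume:>11,} units: ${cost:,.2f} per die")
```

At low volume the NRE term dominates and the advanced-node part is far more expensive per unit; only at very high volume does the silicon term take over, which is the pattern the paragraph above describes.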
Production volumes differ markedly between advanced and mature semiconductor nodes because of economics and application mix. Advanced nodes (e.g., 5 nm, 3 nm) are typically justified only for extremely high-volume markets such as flagship smartphones, data-center CPUs/GPUs, and AI accelerators, where tens of millions—or even hundreds of millions—of units can amortize enormous design and mask costs. In contrast, mature nodes (e.g., 28 nm, 40 nm, 65 nm and above) support a much broader diversity of products—automotive MCUs, power management ICs, analog, RF, and industrial controllers—often produced in moderate but long-lived volumes over many years. While individual mature-node programs may ship fewer units annually than leading-edge mobile processors, the aggregate volume across applications is extremely large and more stable over time, which explains why mature-node capacity remains strategically important despite the industry’s focus on leading-edge scaling.
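The amortization argument above can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only: the function name `cost_per_good_die` is hypothetical, and every dollar figure, die count, and yield is an assumed placeholder rather than data from any foundry. It adds the per-unit share of NRE to the fabrication cost of one good die.

```python
def cost_per_good_die(nre_usd, wafer_cost_usd, dies_per_wafer,
                      yield_fraction, volume_units):
    """Per-unit cost: amortized NRE share plus fabrication cost of one good die."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    if volume_units <= 0:
        raise ValueError("volume_units must be positive")
    fab_cost = wafer_cost_usd / (dies_per_wafer * yield_fraction)
    nre_share = nre_usd / volume_units
    return nre_share + fab_cost


# Assumed, purely illustrative inputs (not vendor data).
advanced = dict(nre_usd=300e6, wafer_cost_usd=17_000.0,
                dies_per_wafer=500, yield_fraction=0.70)
mature = dict(nre_usd=5e6, wafer_cost_usd=3_000.0,
              dies_per_wafer=500, yield_fraction=0.95)

for volume in (100_000, 1_000_000, 50_000_000):
    a = cost_per_good_die(volume_units=volume, **advanced)
    m = cost_per_good_die(volume_units=volume, **mature)
    print(f"{volume:>11,} units: advanced ${a:,.2f}   mature ${m:,.2f}")
```

With these assumed inputs, the NRE share dominates the advanced-node unit cost at low volume and shrinks to a small fraction of it only at tens of millions of units, which is the pattern described above.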
Today, automotive volumes are sufficient to justify custom semiconductor designs on mature nodes, but the lower-volume cyber-physical domains must generally rely on standard parts.
Embedded protocols (including domain-specific), sensors, actuators, long- and short-distance communication and components, navigation and positioning.
From a hardware perspective, the big jump in functionality is the introduction of sensors, the computation to interpret the world, and then actuation to provide autonomy.
Ground.
The graphic illustrates the multi-layered sensor stack typically required for autonomous vehicles, combining complementary sensing modalities to achieve redundancy, range coverage, and environmental robustness. At the longest ranges, long-range radar and forward-facing cameras provide early detection of vehicles, obstacles, and road geometry. Long-range radar operates reliably in rain, fog, and low-light conditions, measuring object distance and relative velocity using Doppler shifts. Cameras, on the other hand, provide high-resolution semantic information—lane markings, traffic signs, traffic lights, and object classification (car vs. pedestrian vs. cyclist). While cameras excel at classification, they are more sensitive to lighting and weather, which is why radar redundancy is essential for safety-critical functions such as adaptive cruise control and highway autopilot.
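The Doppler relationship mentioned above can be written out directly. The following is a minimal sketch, assuming a monostatic radar and a 77 GHz carrier (a band commonly used for automotive radar, chosen here only as an example). The function name is hypothetical, and a positive shift is taken to mean an approaching target.

```python
C_MPS = 299_792_458.0  # speed of light in vacuum, m/s


def radial_velocity_from_doppler(doppler_shift_hz, carrier_hz):
    """Relative radial velocity (m/s) for a monostatic radar.

    The factor of 2 accounts for the two-way path: the wave is shifted once
    on the way to the target and once on the way back.
    """
    if carrier_hz <= 0:
        raise ValueError("carrier_hz must be positive")
    return doppler_shift_hz * C_MPS / (2.0 * carrier_hz)


# Example with an assumed 77 GHz carrier and an assumed 5 kHz measured shift.
print(radial_velocity_from_doppler(5_000.0, 77e9))
```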
In the mid- to short-range envelope around the vehicle, short-range radar and LiDAR (Light Detection and Ranging) enhance situational awareness. Short-range radar monitors adjacent lanes, blind spots, and cross-traffic. LiDAR provides high-precision 3D point clouds, enabling accurate mapping of object contours, free space, and road boundaries. LiDAR is particularly valuable for precise localization and obstacle detection in urban environments. Together, these sensors support functions like lane changes, merging, intersection handling, and obstacle avoidance. Very close to the vehicle, ultrasonic sensors and near-field cameras provide low-speed maneuvering awareness. Ultrasonic sensors detect curbs, parking barriers, and nearby objects within a few meters, enabling parking assist and tight maneuvering. Surround-view camera systems support 360-degree perception for low-speed autonomy and automated parking. Overlaying all these sensing layers is vehicle-to-everything (V2X) or wireless communication, which extends perception beyond line-of-sight by exchanging information with infrastructure and other vehicles. Collectively, the autonomy stack relies on sensor fusion—combining radar robustness, camera semantics, LiDAR precision, and ultrasonic proximity—to create a reliable environmental model suitable for safety-critical decision-making.
In terms of computation, autonomous ground vehicles require high-throughput, low-latency edge computation to process multi-modal sensor streams (camera, radar, LiDAR, ultrasonic) in real time. The compute stack typically integrates heterogeneous architectures—CPUs for control logic, GPUs/NPUs for deep neural network inference, and dedicated safety microcontrollers running ISO 26262–compliant software. These platforms must handle perception (object detection, segmentation), localization (SLAM, sensor fusion), prediction (trajectory forecasting), and planning (path optimization) within tens of milliseconds, all under automotive thermal and power constraints. Redundant compute paths and lockstep processors are often used to meet functional safety goals, with over-the-air update capability enabling continuous improvement.
Airborne.
Airborne autonomous systems rely on a fusion of inertial, air-data, navigation, and external perception sensors to operate in a 3D, high-speed, safety-critical environment. Core sensors include Inertial Measurement Units (IMUs) and air-data systems (pitot tubes, angle-of-attack vanes) for attitude and aerodynamic state estimation; multi-constellation GNSS for global positioning; and radar altimeters for precise height above ground during landing. For obstacle and traffic detection, aircraft increasingly use weather radar, ADS-B receivers (traffic awareness), electro-optical/infrared (EO/IR) cameras, and sometimes LiDAR for detect-and-avoid (particularly in UAVs). Unlike ground systems, airborne autonomy must handle sparse landmarks, high closing speeds, and large vertical envelopes. Sensor reliability and redundancy are critical, with cross-checking between inertial and external navigation sources to meet aviation safety requirements.
Airborne autonomous computation prioritizes determinism, certification traceability, and fault tolerance over raw AI throughput. Flight-critical systems must comply with DO-178C (software) and DO-254 (hardware), which emphasize verified, bounded execution and rigorous testing. Compute platforms are often partitioned using time- and space-separation (e.g., ARINC 653 architectures), ensuring that autonomy functions cannot interfere with flight controls. Compared to automotive, airborne compute may use less cutting-edge silicon but emphasizes redundancy (triple modular redundancy, cross-monitoring processors) and deterministic real-time operating systems. Power and weight constraints are critical, and thermal management must accommodate altitude-related cooling limits.
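To illustrate the idea behind triple modular redundancy, the sketch below votes across three redundant channels by taking the middle value and flags any channel that strays from it. This is a simplified illustration, not a certified design: the function names and the tolerance are assumptions, and a real flight system would add timing, persistence, and latching logic.

```python
def mid_value_select(a, b, c):
    """Return the median of three redundant channel readings."""
    return sorted((a, b, c))[1]


def vote(channels, tolerance):
    """Vote across exactly three channels.

    Returns (voted_value, suspects), where suspects lists the indices of
    channels that differ from the voted value by more than `tolerance`.
    """
    if len(channels) != 3:
        raise ValueError("exactly three channels are required")
    voted = mid_value_select(*channels)
    suspects = [i for i, value in enumerate(channels)
                if abs(value - voted) > tolerance]
    return voted, suspects


# One channel (index 2) has failed high; the median ignores it.
print(vote((10.0, 10.1, 55.0), tolerance=0.5))
```

The median is attractive here because a single faulty channel, however wrong, cannot pull the voted value outside the range spanned by the two healthy channels.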
Marine.
Marine autonomous vessels operate in a reflective, cluttered, and dynamically changing surface environment. Primary sensors include marine radar (long-range detection in fog and rain), GNSS for global positioning, and high-grade IMUs for heading and motion stabilization. Automatic Identification System (AIS) receivers provide cooperative vessel tracking, while optical cameras assist in visual interpretation. COLREG maritime rules must be followed. For near-field awareness, vessels employ optical cameras, thermal cameras (night and low visibility), sonar (for subsurface obstacle detection), and depth sounders to prevent grounding. Compared to ground systems, marine sensing must manage wave motion, multipath reflections from water, salt corrosion, and very long detection ranges with sparse infrastructure. Subsurface autonomy (e.g., AUVs) further depends on acoustic positioning and Doppler velocity logs because GNSS is unavailable underwater.
Marine autonomy computation operates in a lower-speed but highly variable environment, often combining onboard compute with shore-based or cloud-assisted systems. Vessels may employ robust industrial-grade processors running perception and navigation stacks for radar, AIS (Automatic Identification System), sonar, and camera inputs. Because marine systems often operate for extended durations at sea, energy efficiency and environmental hardening (salt, humidity, vibration) are important. Autonomy compute must integrate route optimization, collision avoidance (COLREGs compliance), and remote monitoring, sometimes with partial human oversight. Unlike aerospace, certification is less centralized, allowing somewhat more flexibility in compute architectures.
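A common building block for marine collision avoidance is the closest point of approach (CPA) and the time to reach it (TCPA), computed from own-ship and target tracks such as those derived from radar or AIS. The sketch below assumes both vessels hold constant velocity in a flat local x/y frame measured in meters. The function name and the sample values are hypothetical, and COLREGs rule logic itself is not modeled.

```python
import math


def cpa_tcpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (cpa_distance_m, tcpa_s) for two constant-velocity tracks.

    Positions are (x, y) in meters; velocities are (vx, vy) in m/s.
    TCPA is clamped to zero when the vessels are already diverging.
    """
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq == 0.0:
        # No relative motion: the current separation never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = max(0.0, -(rx * vx + ry * vy) / rel_speed_sq)
    return math.hypot(rx + vx * tcpa, ry + vy * tcpa), tcpa


# Head-on encounter: both vessels at 5 m/s, 1000 m apart on the same line.
print(cpa_tcpa((0.0, 0.0), (0.0, 5.0), (0.0, 1000.0), (0.0, -5.0)))
```

A planner would typically compare these two values against alert thresholds before deciding whether an avoidance maneuver is needed.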
Space.
Space autonomy operates in an extreme, infrastructure-free environment where navigation and state awareness rely heavily on inertial, optical, and celestial sensing. Satellites use star trackers for ultra-precise attitude determination, sun sensors for coarse orientation, and gyroscopes for angular rate measurement. GNSS receivers may be used in low Earth orbit, but deep-space missions rely on onboard optical navigation (planet/star tracking), LiDAR altimeters (for planetary landing), and radar for surface mapping. Proximity operations (e.g., docking, formation flying) use vision-based navigation and relative LiDAR or radar sensors. Unlike ground, airborne, or marine systems, space sensors must withstand radiation, vacuum, and extreme temperature cycles, and they often operate with minimal real-time human supervision due to communication latency. Sensor fusion in space emphasizes fault detection, graceful degradation, and long-duration reliability over raw environmental density.
Space autonomy computation is constrained by radiation tolerance, power availability, and communication latency. Traditional space systems use radiation-hardened processors with lower clock speeds but extremely high reliability and error-correction capabilities. Increasingly, commercial and “NewSpace” missions incorporate higher-performance COTS processors with shielding and fault detection to enable onboard AI for navigation, fault management, and autonomous operations (e.g., satellite constellation management or planetary landing). Because communication delays can be minutes or longer, deep-space systems must support autonomous decision-making with minimal ground intervention. Fault tolerance, graceful degradation, and long mission lifetimes (often 10–20+ years) dominate architectural design choices.
In terms of challenges, autonomy is very much in the early innings. Broadly speaking, the challenges fall into three categories: first, the core technology elements within the autonomy pipeline (sensors, location services, perception, and path planning); second, the algorithms and methodology for demonstrating safety; and finally, business economics.
Autonomous vehicles rely on a suite of sensors (such as LiDAR, radar, cameras, GPS, and ultrasonic devices) to perceive and interpret their surroundings. However, each of these sensor types faces inherent limitations, particularly in challenging environmental conditions. Cameras struggle with low light, glare, and weather interference like rain or fog, while LiDAR can suffer from backscatter in fog or snow. Radar, though more resilient in poor weather, provides lower spatial resolution, making it less effective for detailed object classification. These environmental vulnerabilities reduce the reliability of perception systems, especially in safety-critical scenarios. Another major challenge lies in the integration of multiple sensor types through sensor fusion. Achieving accurate, real-time fusion demands precise temporal synchronisation and spatial calibration, which can drift over time due to mechanical or thermal stresses. Furthermore, sensors are increasingly exposed to cybersecurity threats. GPS and LiDAR spoofing, or adversarial attacks on camera-based recognition systems, can introduce false data or mislead decision-making algorithms, necessitating robust countermeasures at both the hardware and software levels. Sensor systems also face difficulties with occlusion and semantic interpretation. Many sensors require line-of-sight to function properly, so their performance degrades in urban settings with visual obstructions like parked vehicles or construction. Even when objects are detected, understanding their intent, such as whether a pedestrian is about to cross the street, remains a challenge for machine learning models. Meanwhile, high-resolution sensors generate vast data streams, straining onboard processing and communication bandwidth, and creating trade-offs between resolution, latency, and energy efficiency.
Lastly, practical concerns such as cost, size, and durability hinder mass adoption. LiDAR units, while highly effective, are often expensive and mechanically complex. Cameras and radar must also be ruggedised to withstand weather and vibration without degrading in performance. Compounding these issues is the lack of standardised validation methods to assess sensor reliability under varied real-world conditions, making it difficult for developers and regulators to establish trust and ensure safety across diverse operational domains.
The “perception system” is at the core of autonomous vehicle functionality, enabling the car to understand and interpret its surroundings in real time. It processes data from multiple sensors—cameras, LiDAR, radar, and ultrasonic devices—to detect, classify, and track objects. The perception system struggles with “semantic understanding and edge cases.” While object detection and classification have improved with deep learning, these models often fail in rare or unusual scenarios—like an overturned vehicle, a pedestrian in costume, or construction detours. Understanding the context and intent behind actions (e.g., whether a pedestrian is about to cross) is even harder. This lack of true situational awareness can lead to poor decision-making and is a key challenge for Level 4 and 5 autonomy. Also, the “computational burden” of real-time perception—especially with high-resolution inputs—creates constraints in terms of processing power, thermal management, and latency. Balancing model accuracy with speed and ensuring system performance across embedded platforms is a persistent engineering challenge.
Location services—often referred to as localisation—are essential to autonomous vehicles (AVs), enabling them to determine their precise position within a map or real-world environment. While traditional GPS offers basic positioning, autonomous vehicles require “centimetre-level accuracy,” robustness, and real-time responsiveness, all of which present significant challenges. One major challenge is the “limited accuracy and reliability of GNSS (Global Navigation Satellite Systems)” such as GPS, especially in urban canyons, tunnels, or areas with dense foliage. Buildings can block or reflect satellite signals, leading to multi-path errors or complete signal loss. While techniques like Real-Time Kinematic (RTK) correction and augmentation via ground stations improve accuracy, these solutions can be expensive, infrastructure-dependent, and still prone to failure in GNSS-denied environments. To compensate, AVs often combine GPS with “sensor-based localisation,” including LiDAR, cameras, and IMUs (inertial measurement units), which enable map-based and dead-reckoning approaches. Sensor-based dead reckoning using IMUs and odometry can help bridge short GNSS outages, but “drift accumulates over time,” and errors can compound, especially during sharp turns, vibrations, or tyre slippage. Finally, “map-based localisation” depends on the availability of high-definition (HD) maps that include detailed features like lane markings, curbs, and traffic signs. These maps are costly to build and maintain, and they can become outdated quickly due to road changes, construction, or temporary obstructions—leading to mislocalization.
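The way dead-reckoning drift accumulates can be seen in a toy simulation. The sketch below assumes a vehicle driving perfectly straight at constant speed while its gyro reports a small constant yaw-rate bias. The function name, the bias of 0.1 degrees per second, and the speed are assumed placeholder values, and real IMU error models include noise and scale-factor terms that are ignored here.

```python
import math


def dead_reckoning_error_m(speed_mps, yaw_rate_bias_dps, duration_s, dt=0.01):
    """Position error after integrating a biased yaw rate on a straight drive."""
    heading = 0.0
    x = y = 0.0
    bias_rps = math.radians(yaw_rate_bias_dps)
    steps = int(round(duration_s / dt))
    for _ in range(steps):
        # The true yaw rate is zero, so only the bias is integrated.
        heading += bias_rps * dt
        x += speed_mps * math.cos(heading) * dt
        y += speed_mps * math.sin(heading) * dt
    true_x = speed_mps * steps * dt  # ground truth: straight along x
    return math.hypot(x - true_x, y)


for seconds in (10, 30, 60):
    error = dead_reckoning_error_m(15.0, 0.1, seconds)
    print(f"{seconds:>3d} s GNSS outage -> {error:.2f} m position error")
```

Because the heading error grows linearly with time, the position error in this toy model grows roughly with the square of the outage duration, which is why dead reckoning is suited to bridging short gaps only.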
Path planning in autonomous vehicles is a complex and safety-critical task that involves determining the vehicle's trajectory from its current position to a desired destination while avoiding obstacles, complying with traffic rules, and ensuring passenger comfort. One of the most significant challenges in this area is dealing with dynamic and unpredictable environments. The behaviour of other road users—such as pedestrians, cyclists, and human drivers—can be erratic, requiring the planner to continuously adapt in real time. Predicting these agents' intentions is inherently uncertain and often leads to either overly cautious or unsafe behaviour if misjudged. Real-time responsiveness is another major constraint. Path planning must be executed with low latency while factoring in a wide range of considerations, including traffic laws, road geometry, sensor data, and vehicle dynamics. This requires balancing optimality, safety, and computational efficiency within strict time limits. Additionally, the planner must account for the vehicle’s physical constraints, such as turning radius, acceleration, and braking limits, especially in complex manoeuvres like unprotected turns or obstacle avoidance. Another persistent challenge is operating with incomplete or noisy information. Sensor occlusion, poor weather, or localisation drift can obscure critical details such as road markings, traffic signs, or nearby objects. Planners must therefore make decisions under uncertainty, which adds complexity and risk. Moreover, the vehicle must navigate complex and often-changing road topologies—like roundabouts, construction zones, or temporary detours—where map data may be outdated or ambiguous. Finally, the need for continuous replanning introduces issues of robustness and comfort. The path planning system must frequently adjust trajectories to respond to new inputs, but abrupt changes can degrade ride quality or destabilise the vehicle. 
All of this must be done while maintaining rigorous safety guarantees, ensuring that every planned path can be verified as collision-free and legally compliant. Developing a system that meets these demands across diverse environments and edge cases remains one of the toughest challenges in achieving fully autonomous driving.
Algorithms and Methodology for Safety:
A major bottleneck remains the inability to fully validate AI behaviour, with a need for more rigorous methods to assess completeness, generate targeted test cases, and bound system behaviour. Advancements in explainable AI, digital twins, and formal methods are seen as promising paths forward. Additionally, current systems lack scalable abstraction hierarchies—hindering the ability to generalise component-level validation to system-level assurance. To build trust with users and regulators, the industry must also adopt a “progressive safety framework,” clearly showing continuous improvement, regression checks during over-the-air (OTA) updates, and lessons learned from real-world failures.
In terms of “V&V test apparatuses,” both virtual and physical tools are emphasised. Virtual environments will play a key role in supporting evolving V&V methodologies, necessitating ongoing work from standards bodies like ASAM. Physical test tracks must evolve to not only replicate real-world scenarios efficiently but also validate the accuracy of their virtual counterparts—envisioned through a “movie set” model that can quickly stage complex scenarios. Another emerging concern is “electromagnetic interference (EMI),” especially due to the widespread use of active sensors. Traditional static EMI testing methods are insufficient, and there is a need for dynamic, programmable EMI testing environments tailored to cyber-physical systems.
Finally, a rising concern is around cybersecurity in autonomous systems. These systems introduce systemic vulnerabilities that span from hardware to software, necessitating government-level oversight. Key sensor modalities like LiDAR, GPS, and radar are susceptible to spoofing, and detecting such threats is an urgent research priority. The V&V process itself must evolve to minimise exposure to adversarial attacks, effectively treating security as an intrinsic constraint within system validation, not an afterthought.
Business Models and Supply Chain:
Robo-taxis, or autonomous ride-hailing vehicles, represent a promising use case for autonomous vehicle (AV) technology, with the potential to transform urban mobility by offering on-demand, driverless transportation. Key use models include urban ride-hailing in city centres, first- and last-mile transit to connect riders with public transportation, airport and hotel shuttle services in geofenced areas, and mobility on closed campuses like universities or corporate parks. These models aim to increase vehicle utilization, reduce transportation costs, and offer greater convenience, particularly in environments where human-driver costs are a major factor. However, the business challenges are substantial. The development and deployment of robo-taxi fleets require enormous capital investment in hardware, software, testing, and infrastructure. Operational costs remain high, particularly in the early stages when human safety drivers, detailed maps, and limited deployment zones are still necessary. Regulatory uncertainty also hampers scalability, with different jurisdictions applying inconsistent safety, insurance, and operational standards. This makes expansion slow and costly.
In addition, consumer trust in autonomous systems remains fragile. High-profile incidents have raised safety concerns, and many riders may be hesitant to use driverless vehicles, especially in unfamiliar or emergency situations. Infrastructure constraints—such as poor road markings or limited connectivity—further limit the environments in which robo-taxis can operate reliably. Meanwhile, the path to profitability is challenged by competitive fare pricing, fleet maintenance logistics, and integration with broader transportation networks. Overall, while robo-taxis offer significant long-term promise, their success hinges on overcoming a complex mix of technological, regulatory, and business barriers.
The evolving economics of the semiconductor industry pose a significant challenge for low-volume markets, where custom chip development is often not cost-effective. As a result, autonomous and safety-critical systems must increasingly rely on Commercial Off-The-Shelf (COTS) components, making it essential to develop methodologies that can ensure security, reliability, and performance using these standardised parts. This shift places greater emphasis on designing systems that are resilient and adaptable, even without custom silicon. Additionally, traditional concerns like field maintainability, lifetime cost, and design-for-supply-chain practices—common in mechanical and industrial engineering—must now be applied to electronics and embedded systems. As electronic components dominate modern products, a more holistic design approach is needed to manage downstream supply chain implications. The trend toward software-defined vehicles reflects this need, promoting deeper integration between hardware and software suppliers. To further enhance supply chain resilience, there's a push to standardise around a smaller set of high-volume chips and embrace flexible, programmable hardware fabrics that integrate digital, analogue, and software elements. This architecture shift is key to mitigating supply disruptions and maintaining long-term system viability. Finally, “maintainability” also implies the availability of in-field repair facilities, which must be upgraded to handle autonomy.
Autonomous vehicles place extraordinary demands on their sensing stack. Cameras, LiDARs, radars, and inertial/GNSS units do more than capture the environment—they define the limits of what the vehicle can possibly know. A planner cannot avoid a hazard it never perceived, and a controller cannot compensate for latency or drift it is never told about. Sensor validation therefore plays a foundational role in safety assurance: it characterizes what the sensors can and cannot see, how those signals are transformed into machine-interpretable entities, and how residual imperfections propagate into system-level risk within the intended operational design domain (ODD).
In practice, validation bridges three layers that must remain connected in the evidence trail. The first is the hardware layer, which concerns intrinsic performance such as resolution, range, sensitivity, and dynamic range; extrinsic geometry that pins each sensor into the vehicle frame; and temporal behavior including latency, jitter, timestamp accuracy, and clock drift. The second is the signal-to-perception layer, where raw measurements are filtered, synchronized, fused, and converted into maps, detections, tracks, and semantic labels. The third is the operational layer, which tests whether the sensing system—used by the autonomy stack as deployed—behaves acceptably across the ODD, including rare lighting, weather, and traffic geometries. A credible program links evidence across these layers to a structured safety case aligned with functional safety (ISO 26262), SOTIF (ISO 21448), and system-level assurance frameworks, making explicit claims about adequacy and known limitations.
The overarching aim is not merely to pass tests but to bound uncertainty and preserve traceability. For each modality, the team seeks a quantified understanding of performance envelopes: how detection probability and error distributions shift with distance, angle, reflectivity, ego speed, occlusion, precipitation, sun angle, and electromagnetic or thermal stress. These envelopes are only useful when translated into perception key performance indicators and, ultimately, into safety metrics such as minimum distance to collision, time-to-collision thresholds, mission success rates, and comfort indices. Equally important is traceability from a system-level outcome back to sensing conditions and processing choices—so a late failure can be diagnosed as calibration drift, timestamp skew, brittle ground filtering, overconfident tracking, or a planner assumption about obstacle contours. Validation artifacts—calibration reports, timing analyses, parameter-sweep results, and dataset manifests—must therefore be organized so that claims in the safety case are backed by reproducible evidence.
The bench begins with geometry and time. Intrinsic calibration (for cameras: focal length, principal point, distortion; for LiDAR: channel angles and firing timing) ensures raw measurements are geometrically meaningful, while extrinsic calibration fixes rigid-body transforms among sensors and relative to the vehicle frame. Temporal validation establishes timestamp accuracy, cross-sensor alignment, and end-to-end latency budgets. Small timing mismatches that seem benign in isolation can yield multi-meter spatial discrepancies during fusion, particularly when tracking fast-moving actors or when the ego vehicle is turning. Modern stacks depend on this foundation: a LiDAR–camera fusion pipeline that projects point clouds into image coordinates requires both precise extrinsics and sub-frame-level temporal alignment to avoid ghosted edges and misaligned semantic labels. Calibration is not a one-off event; temperature cycles, vibration, and maintenance can shift extrinsics, and firmware updates can alter timing. Treat calibration and timing as monitorable health signals with periodic self-checks—board patterns for cameras, loop-closure or NDT metrics for LiDAR localization, and GNSS/IMU consistency tests—to catch drift before it erodes safety margins.
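The effect of timing mismatches on fusion can be estimated with simple kinematics. The sketch below is a first-order approximation under stated assumptions: the function names are hypothetical, the 50 ms offset and other inputs are example values, and the rotational case treats the skew as a pure ego rotation about the sensor origin.

```python
import math


def translation_skew_error_m(relative_speed_mps, time_offset_s):
    """Apparent displacement of an object caused by a timestamp offset."""
    return abs(relative_speed_mps * time_offset_s)


def rotation_skew_error_m(yaw_rate_dps, time_offset_s, range_m):
    """Chord length swept at `range_m` while the ego vehicle yaws during the offset."""
    angle_rad = math.radians(yaw_rate_dps) * time_offset_s
    return abs(2.0 * range_m * math.sin(angle_rad / 2.0))


offset_s = 0.050  # assumed cross-sensor timestamp skew
print(translation_skew_error_m(30.0, offset_s))     # oncoming actor, 30 m/s closing speed
print(rotation_skew_error_m(20.0, offset_s, 100.0))  # ego turning at 20 deg/s, object at 100 m
```

With these example values, a 50 ms offset at a 30 m/s closing speed displaces the object by 1.5 m, which is already comparable to the width of a pedestrian or half a lane.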
Validation must extend beyond the sensor to the pre-processing and fusion pipeline. Choices about ground removal, motion compensation, glare handling, region-of-interest cropping, or track-confirmation logic can change effective perception range and false-negative rates more than a nominal hardware swap. Controlled parameter sensitivity studies are therefore essential. Vary a single pre-processing parameter over a realistic range and measure how first-detection distance, false-alarm rate, and track stability evolve. These studies are inexpensive in simulation and surgical on a test track, and they surface brittleness early, before it appears as uncomfortable braking or missed obstacles in traffic. Notably, changes to LiDAR ground-filter thresholds can shorten the maximum distance at which a stopped vehicle is detected by tens of meters, shaving seconds off reaction time and elevating risk—an effect that should be measured and tied explicitly to safety margins.
Perception KPIs must be defined with downstream decisions in mind. Aggregate AUCs are less informative than scoped statements such as “stopped-vehicle detection range at ninety-percent recall under dry daylight urban conditions.” Localization health is better expressed as a time-series metric correlated with map density and scene content than as a single RMS figure. The aim is to generate metrics a planner designer can reason about when setting buffers and behaviors. These perception-level KPIs should be linked to system-level safety measures—minimum distance to collision, collision occurrence, braking aggressiveness, steering smoothness—so that changes in sensing or pre-processing can be convincingly shown to increase or decrease risk.
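A scoped KPI such as detection range at a target recall can be computed from labeled encounters. The sketch below is one possible definition, not a standard one: it bins ground-truth objects by range and reports the farthest range up to which every bin meets the target recall, stopping at the first empty or failing bin. The function name, bin width, and sample data are assumptions.

```python
from collections import defaultdict


def detection_range_at_recall(samples, target_recall=0.9, bin_width_m=10.0):
    """Farthest range (m) up to which per-bin recall stays at or above target.

    `samples` is an iterable of (range_m, detected) pairs, one per
    ground-truth object instance. Empty bins end the search conservatively.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for range_m, detected in samples:
        bin_index = int(range_m // bin_width_m)
        totals[bin_index] += 1
        hits[bin_index] += bool(detected)

    best_range = 0.0
    bin_index = 0
    while bin_index in totals and hits[bin_index] / totals[bin_index] >= target_recall:
        best_range = (bin_index + 1) * bin_width_m
        bin_index += 1
    return best_range


# Made-up sample: perfect recall to 10 m, 90% to 20 m, 50% beyond that.
samples = ([(5.0, True)] * 10 + [(15.0, True)] * 9 + [(15.0, False)]
           + [(25.0, True)] * 5 + [(25.0, False)] * 5)
print(detection_range_at_recall(samples))
```

In practice the samples would be filtered first to the scoped condition (for example, stopped vehicles in dry daylight urban scenes) so that the resulting number is one a planner designer can use directly.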
One notable consequence of the need for sensor calibration is that calibration capability must be built into the maintenance infrastructure for these products.
Miles driven is a weak proxy for sensing assurance. What matters is which situations were exercised and how well they cover the risk landscape. Scenario-based validation replaces ad-hoc mileage with structured, parameterized scenes that target sensing stressors: low-contrast pedestrians, vehicles partially occluded at offset angles, near-horizon sun glare, complex specular backgrounds, or rain-induced attenuation. Scenario description languages allow these scenes to be specified as distributions over positions, velocities, behaviors, and environmental conditions, yielding reproducible and tunable tests rather than anecdotal encounters. Formal methods augment this process through falsification—automated searches that home in on configurations most likely to violate monitorable safety properties, such as maintaining a minimum separation or confirming lane clearance for a fixed dwell time. This formalism pays two dividends: it turns vague requirements into properties that can be checked in simulation and on track, and it exposes precise boundary conditions where sensing becomes fragile, which are exactly the limitations a safety case must cite and operations must mitigate with ODD constraints.
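The falsification idea can be sketched with a deliberately simple stand-in. The example below uses plain random search over three scenario parameters and a toy stopping model; practical falsification tools typically use guided optimization and a full simulator instead. All names, parameter ranges, the deceleration, and the required gap are assumptions made for illustration.

```python
import random


def min_separation_m(ego_speed_mps, first_detection_m, latency_s, decel_mps2=6.0):
    """Toy model: gap left when the ego vehicle stops for a static obstacle.

    A negative value means the toy vehicle could not stop in time.
    """
    travel = ego_speed_mps * latency_s + ego_speed_mps ** 2 / (2.0 * decel_mps2)
    return first_detection_m - travel


def falsify(n_trials=2000, required_gap_m=2.0, seed=0):
    """Random search for the scenario with the lowest robustness margin.

    Robustness is (achieved gap - required gap); a negative value is a
    counterexample to the property "always keep at least required_gap_m".
    """
    rng = random.Random(seed)
    worst = None
    for _ in range(n_trials):
        params = {
            "ego_speed_mps": rng.uniform(5.0, 25.0),
            "first_detection_m": rng.uniform(30.0, 120.0),
            "latency_s": rng.uniform(0.1, 0.5),
        }
        robustness = min_separation_m(**params) - required_gap_m
        if worst is None or robustness < worst[0]:
            worst = (robustness, params)
    return worst


robustness, params = falsify()
print(f"lowest robustness margin: {robustness:.1f} m at {params}")
```

The parameters of the worst case found mark a boundary region (here, high speed combined with late first detection) that a safety case would either mitigate through design or exclude through ODD constraints.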
High-fidelity software-in-the-loop closes the gap between abstract scenarios and the deployed stack. Virtual cameras, LiDARs, and radars can drive the real perception software through middleware bridges, enabling controlled reproduction of rare cases, precise occlusions, and safe evaluation of updates. But virtual sensors are models, not mirrors; rendering pipelines may fail to capture radar multipath, rolling-shutter distortions, wet-road reflectance, or the exact beam divergence of a specific LiDAR. The simulator should therefore be treated as an instrument that requires its own validation. A practical approach is to maintain paired scenarios: for a subset of tests, collect real-world runs with raw logs and environmental measurements, then reconstruct them in simulation as faithfully as possible. Compare detection timelines, track stability, and minimum-distance outcomes, and quantify the divergence with time-series metrics such as dynamic time warping on distance profiles, discrepancies in first-detection timestamps, and divergence in track IDs. The goal is not to erase the sim-to-real gap—an unrealistic aim—but to bound it and understand where simulation is conservative versus optimistic.
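Dynamic time warping, one of the time-series metrics named above, can be sketched in a few lines. The version below is the textbook dynamic-programming formulation with an absolute-difference cost and no windowing constraint. The function name and the two distance profiles are made-up examples, not measured data.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances, b holds
                                    cost[i][j - 1],      # b advances, a holds
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical ego-to-obstacle distance profiles (m) from a real run and its replay.
real_run = [50.0, 42.0, 35.0, 29.0, 24.0, 20.0]
sim_run = [50.0, 50.0, 42.0, 35.0, 29.0, 24.0, 20.0]  # same shape, one sample late
print(dtw_distance(real_run, sim_run))
```

Because DTW tolerates small time shifts, it separates a harmless timing offset between simulation and reality from a genuine difference in the shape of the distance profile.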
Because budgets are finite, an efficient program adopts a two-layer workflow. The first layer uses faster-than-real-time, lower-fidelity components to explore large scenario spaces, prune uninformative regions, and rank conditions by estimated safety impact. The second layer replays the most informative cases in a photorealistic environment that streams virtual sensor data into the actual autonomy stack and closes the control loop back to the simulator. Both layers log identical KPIs and time-aligned traces so results are comparable and transferable to track trials. This combination of breadth and fidelity uncovers corner cases quickly, quantifies their safety implications, and yields ready-to-execute test-track procedures for final confirmation.
Modern validation must encompass both accidental faults and malicious interference. Sensors can be disrupted by spoofing, saturation, or crafted patterns; radars can suffer interference; GPS can be jammed or spoofed; IMUs drift. Treat these as structured negative test suites, not afterthoughts. Vary spoofing density, duration, and geometry; inject glare or saturation within safe experimental protocols; reproduce radar interference in simulation or with hardware-in-the-loop rigs; and record how perception KPIs and system-level safety metrics respond. The objective is twofold: quantify degradation (how much earlier detection fails, how often tracks drop) and evaluate defenses such as cross-modality consistency checks, health-monitor voting, and fallbacks that reduce speed and increase headway when sensing confidence falls below thresholds. This work connects directly to SOTIF by exposing performance-limited hazards amplified by adversarial conditions, and to functional safety by demonstrating safe states under faults.
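A cross-modality consistency check feeding a speed and headway fallback might look like the sketch below. All thresholds, the linear confidence decay, and the names `Limits`, `cross_modal_confidence`, and `select_limits` are illustrative assumptions; a deployed system would derive them from the validated sensing envelope.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Limits:
    max_speed_mps: float
    headway_s: float

NOMINAL = Limits(max_speed_mps=25.0, headway_s=1.5)

def cross_modal_confidence(camera_range_m, radar_range_m, tolerance_m=2.0):
    """Toy consistency score: 1.0 when the two range estimates agree within
    tolerance, decaying linearly to 0.0; a missing modality scores 0.0."""
    if camera_range_m is None or radar_range_m is None:
        return 0.0
    excess = max(0.0, abs(camera_range_m - radar_range_m) - tolerance_m)
    return max(0.0, 1.0 - excess / (4 * tolerance_m))

def select_limits(confidence, nominal=NOMINAL):
    """Map sensing confidence to operating limits (thresholds are assumed)."""
    if confidence >= 0.8:
        return nominal
    if confidence >= 0.4:
        # Degraded sensing: slow down and open up the following gap.
        return Limits(nominal.max_speed_mps * 0.6, nominal.headway_s * 1.5)
    # Confidence too low: command a stop as the minimal-risk condition.
    return Limits(0.0, nominal.headway_s * 2.0)

for cam, rad in [(30.0, 31.0), (30.0, 36.0), (30.0, None)]:
    c = cross_modal_confidence(cam, rad)
    print(cam, rad, round(c, 2), select_limits(c))
```

In a negative test suite, the same function can be driven with spoofed or saturated inputs to record when the fallback engages relative to when detection actually degrades.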
Validation produces data, but assurance requires an argument. Findings should be organized so that each top-level claim—such as adequacy of the sensing stack for the defined ODD—is supported by clearly scoped subclaims and evidence: calibrated geometry and timing within monitored bounds; modality-specific detection and tracking KPIs across representative environmental strata; quantified sim-to-real differences for critical scenes; scenario-coverage metrics that show where confidence is high and where operational mitigations apply; and results from robustness and security tests. Where limitations remain—as they always do—they should be stated plainly and tied to mitigations, whether that means reduced operational speed in heavy rain beyond a specified attenuation level, restricted ODD where snow eliminates lane semantics, or explicit maintenance intervals for recalibration.
A final pragmatic recommendation is to treat validation data as a first-class product. Raw logs, configuration snapshots, and processing parameters should be versioned, queryable, and replayable. Reproducibility transforms validation from a hurdle into an engineering asset: when a perception regression appears after a minor software update, the same scenarios can be replayed to pinpoint the change; when a new sensor model is proposed, detection envelopes and safety margins can be compared quickly and credibly. In this way, the validation of perception sensors becomes a disciplined, scenario-driven program that ties physical sensing performance to perception behavior and ultimately to system-level safety outcomes, while continuously informing design choices that make the next round of validation faster and more effective.
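One way to make runs versioned and replayable is to fingerprint each run from its raw log, configuration snapshot, and processing parameters. The sketch below uses only the Python standard library; the manifest layout and the example field names (`sensor`, `firmware`, `threshold`) are hypothetical.

```python
import hashlib
import json

def run_fingerprint(log_bytes, config, params):
    """Return (fingerprint, manifest) for one validation run.
    The manifest is serialized canonically (sorted keys, fixed separators)
    so that logically identical inputs always hash to the same value."""
    manifest = {
        "log_sha256": hashlib.sha256(log_bytes).hexdigest(),
        "config": config,
        "params": params,
    }
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest(), manifest

log = b"raw sensor log bytes"
fp_a, _ = run_fingerprint(log, {"sensor": "lidar_front", "firmware": "1.2"},
                          {"threshold": 0.5})
fp_b, _ = run_fingerprint(log, {"firmware": "1.2", "sensor": "lidar_front"},
                          {"threshold": 0.5})
fp_c, _ = run_fingerprint(log, {"sensor": "lidar_front", "firmware": "1.2"},
                          {"threshold": 0.6})

print(fp_a == fp_b)   # key order does not matter
print(fp_a == fp_c)   # a changed parameter yields a different fingerprint
```

Storing the fingerprint next to each result makes it straightforward to confirm that a replay after a software update used the same inputs as the original run.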
Governance and Safety Challenges:
What are the implications for automakers? In modern vehicles, electronics are no longer confined to infotainment or engine control: sensors, communication modules, and controllers are now central to vehicle safety and performance. These systems emit and receive electromagnetic energy, which can result in electromagnetic interference (EMI) if not properly managed. EMI can compromise safety-critical applications like radar-based adaptive cruise control or camera-based lane keeping. Sensor technologies introduce unique EMI challenges. Radar and lidar sensors, which are critical for driver assistance and autonomous systems, must avoid interfering with one another, and radar in particular must operate within spectrum allocations defined by the FCC and global bodies like the ITU. Similarly, cameras and ultrasonic sensors are susceptible to noise from nearby power electronics, especially in electric vehicles. EMI from poorly shielded cables or high-frequency switching components can cause data corruption, missed detections, or degraded signal integrity, raising both functional safety and regulatory concerns.
From a communications standpoint, FCC-compliant system design must also consider interoperability and coexistence. In a vehicle packed with Bluetooth, Wi-Fi, GPS, DSRC or C-V2X, and cellular modules, maintaining RF harmony requires careful frequency planning, shielding, and filtering. The FCC’s evolving rules for the 5.9 GHz band—reallocating portions from DSRC to C-V2X—illustrate how regulatory frameworks directly impact product architecture. OEMs must track these developments and validate that their communication modules not only operate within approved frequency bands but also do not emit spurious signals that could violate FCC emission ceilings. To meet FCC standards while ensuring high system reliability, automotive developers must embed EMI considerations early in the design cycle. Pre-compliance testing, EMI-aware PCB layout, and component-level certification all contribute to a smoother path to regulatory approval. Moreover, aligning FCC requirements with international automotive EMC standards—like CISPR 25 and UNECE R10—helps ensure global market readiness. As vehicles grow increasingly software-defined, connected, and autonomous, managing EMI through smart engineering and regulatory foresight will be a critical enabler of innovation, safety, and compliance.
As discussed, FCC regulations are primarily focused on electromagnetic interference. However, if RF energy has the potential to cause health issues, other regulators are involved. Health and safety regulation for FCC Part 18 devices, such as microwave ovens and medical RF equipment, is primarily handled by other agencies. The Food and Drug Administration (FDA) oversees radiation-emitting electronic products to ensure they meet safety standards for human exposure, particularly for consumer appliances and medical devices. The Occupational Safety and Health Administration (OSHA) establishes workplace safety limits for RF exposure to protect employees who operate or work near such equipment. Meanwhile, the National Institute for Occupational Safety and Health (NIOSH) conducts research and provides guidance on safe RF exposure levels in occupational settings. While the FCC regulates RF emissions from Part 18 devices to prevent interference with licensed communication systems, it relies on these other agencies to ensure that the devices do not pose health risks to users or workers.
For vehicle makers, Part 18 health issues arise in use cases such as wireless power transfer, where specific absorption rate (SAR) levels may directly affect safety.
Finally, while the examples above are drawn from a US context, comparable regulatory structures exist in most other jurisdictions.
In the last decade, the airborne sector has layered autonomy and advanced sensing on top of this foundation. Modern UAVs and advanced air mobility platforms integrate sensor fusion processors, vision systems, and AI accelerators for detect-and-avoid and autonomous navigation. Commercial transports incorporate enhanced vision systems, predictive maintenance analytics, and increasingly software-defined capabilities. However, unlike automotive’s rapid consumer-driven scaling, airborne electronics remain constrained by certification timelines, long product lifecycles (20–30+ years), and extreme environmental requirements (temperature, vibration, radiation).
Autonomous systems add several unique layers of complexity to both hardware integration and supply chain management:
Multi-Vendor Dependency: A single autonomous platform may use components from dozens of vendors, from AI accelerators to GNSS modules. Managing version control, firmware updates, and hardware compatibility across this ecosystem requires multi-tier coordination and continuous configuration tracking [55].
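Safety-Critical Certification: Hardware must meet safety and regulatory certifications, such as ISO 26262 for road vehicles, IEC 61508 for general functional safety, and DO-254 for airborne electronic hardware.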
Each certification adds cost, time, and documentation requirements.
Real-Time and Deterministic Performance: Integration must guarantee low-latency, deterministic behavior, meaning that sensors, processors, and actuators must exchange data with bounded latency and timing precision down to the microsecond level. This influences hardware selection and network design [56].
Rapid Technology Obsolescence: AI and embedded computing evolve faster than mechanical systems. Components often become obsolete before the platform’s lifecycle ends, forcing supply chains to manage technology refresh cycles and long-term component availability planning [57].
Possible Solutions and Best Practices
The most important challenges and possible solutions are summarized in the following table:
| Challenge | Solution / Mitigation Strategy |
|---|---|
| Component Shortages | Multi-sourcing strategies and localized fabrication partnerships. The European Chips Act is one example of a policy aimed at securing future supply. |
| Supplier QA Variance | Supplier qualification programs and continuous audit loops. |
| Cybersecurity Risks | Hardware attestation, firmware signing, and supply chain transparency tools (e.g., SBOMs). |
| Ethical Sourcing | Traceable material chains via blockchain and sustainability certification. |
| Obsolescence | Lifecycle management databases (e.g., Siemens Teamcenter, PTC Windchill). |
| Integration Complexity | Use of standardized hardware interfaces (CAN-FD, Ethernet TSN, PCIe). |
Typical Supply Chain Management (SCM) Approaches
Strategic Partnerships and Vertical Integration
Many companies are moving toward vertical integration, controlling multiple stages of the supply chain. For instance:
This approach increases supply security and reduces dependency on third parties, though it requires substantial capital investment.
Sustainability and Ethical SCM
Sustainability in supply chains focuses on reducing carbon footprint, ensuring ethical sourcing, and promoting recyclability [65]. Key practices:
Effective hardware integration and supply chain management are tightly interwoven. Integration depends on having high-quality, compatible components, while supply chains rely on robust feedback from integration and testing to forecast needs, reduce waste, and maintain reliability. Modern SCM frameworks, particularly Lean, Agile, and Digital models, offer strategies to make the autonomy industry more resilient, sustainable, and responsive.
This chapter explains how semiconductors and electronics became the foundation of modern autonomous systems across ground, airborne, marine, and space platforms. It shows a common historical pattern: systems began with mostly mechanical or isolated electronic functions, then evolved toward digitized control, networked subsystems, and increasingly autonomous operation. In cars, this meant moving from engine control to chassis, infotainment, electrification, and ADAS; in aircraft, ships, and spacecraft, it meant a similar shift from stand-alone avionics or navigation aids to integrated, safety-critical digital architectures.
The chapter also emphasizes that autonomy is not just a matter of adding sensors. It requires a full ecosystem of hardware, computation, validation, and governance. Different domains rely on different sensor mixes—such as radar, cameras, LiDAR, GNSS, IMUs, sonar, or star trackers—but all must fuse data and convert it into safe decisions in real time. Because these systems are safety-critical, the chapter highlights the importance of standards such as ISO 26262, IEC 61508, and DO-254, along with validation processes that include calibration, timing analysis, scenario-based testing, simulation, and structured safety cases.
Finally, the chapter argues that successful autonomous systems depend on more than technical performance: they must also navigate EMI regulation, health and safety oversight, and resilient supply chains. The discussion covers FCC spectrum and emissions compliance, EMC testing, and the role of accredited labs, then moves into supply-chain challenges such as component scarcity, cybersecurity, certification burdens, ethical sourcing, and technology obsolescence. The main takeaway is that autonomous systems are not just advanced machines—they are complex, tightly integrated products whose success depends on coordinated progress in electronics, sensing, safety, validation, and supply chain management.
What is Software?
Programmable Hardware and the Emergence of Software Systems
The previous chapter introduced electronic hardware and the role of electronic components in implementing system functionality. However, the physical nature of hardware—and the inherent complexity of designing across mechanical, electrical, and logical domains—places fundamental limits on the speed and flexibility with which new system capabilities can be developed. To address these limitations, hardware platforms evolved to support programmability after fabrication. This programmability enables a separation between physical implementation and functional behavior, allowing systems to be adapted without redesigning the underlying hardware.
These programming paradigms introduce several important system-level considerations:
The concept of programmable hardware was significantly advanced in the 1960s with the introduction of the IBM System/360, which formalized the notion of a stable computer architecture. This development marked a critical transition from device-specific design to platform-based computing and introduced several enduring properties:
Since the introduction of computer architectures in the 1960s, rapid advances in semiconductor technology, system design, and networking have driven an exponential expansion in computing capability. These developments have transformed nearly every aspect of modern society through what is broadly referred to as information technology. The programming of these systems—spanning configuration, control, and application logic—is collectively known as software.
Open-source systems have played a transformative role in the evolution of information technology by accelerating innovation, lowering barriers to entry, and standardizing software infrastructure across heterogeneous environments. Foundational platforms such as Linux and the Apache HTTP Server, together with languages and toolchains such as Python and GCC, enabled a global, collaborative development model in which individuals, academia, and industry could contribute to shared software stacks. This model fostered rapid iteration, transparency, and portability, allowing software to scale from individual machines to cloud-scale distributed systems. Open-source licensing also enabled companies to build commercial products atop shared infrastructure, leading to the emergence of entire ecosystems around cloud computing, data analytics, and artificial intelligence. As a result, open-source software became a cornerstone of modern IT, underpinning everything from web services to high-performance computing and enabling a pace of innovation that would have been difficult to achieve through proprietary development alone.
While the IT ecosystem drove massive innovations and built incredible capabilities, these capabilities could not be directly used in cyber-physical systems. Cyber-physical software differs from conventional embedded or enterprise software because it operates under strict real-time constraints and it needs robust fault tolerance and safety compliance. The historical introduction of software into cyber-physical systems followed different timelines across ground, airborne, marine, and space domains, but in all four cases the long-term trend was the same: software evolved from supporting narrow control functions to becoming the central coordinating layer for sensing, decision-making, communication, and actuation. In the earliest generation of these systems, most functionality was mechanical, hydraulic, analog, or electromechanical. As digital electronics matured, software first entered as a way to improve control precision, reduce weight, support diagnostics, and increase flexibility. Over time, however, software stopped being merely an enhancement and became essential to system operation. This shift was one of the major enablers of autonomy.
In ground systems, especially automobiles, software emerged in a practical production role during the 1970s and early 1980s, when tightening emissions regulations pushed manufacturers toward microprocessor-based engine control. Early automotive software was relatively narrow in scope, focused on ignition timing, fuel injection, and engine management. As electronics spread into anti-lock braking, traction control, airbags, steering, body electronics, and infotainment, software grew from embedded control logic into a distributed system running across many electronic control units. The later introduction of in-vehicle networks such as CAN and FlexRay further expanded software’s role, because control units now had to exchange data and coordinate across domains rather than operate as isolated devices. By the 2010s, with electrification and ADAS, software had become inseparable from perception, energy management, diagnostics, communications, and vehicle behavior.
In airborne systems, software entered earlier and under stricter safety expectations because avionics quickly became tied to navigation, stability, and flight control. Early aircraft electronics were largely analog and federated, but the move to digital control accelerated in the 1970s and 1980s, culminating in the rise of fly-by-wire systems. NASA notes that its F-8 Digital Fly-By-Wire aircraft became, on May 25, 1972, the first aircraft to fly completely dependent on an electronic flight-control system, marking a major turning point in the acceptance of software within the control loop. Later developments such as glass cockpits, FADEC, and integrated avionics made software central not only to control, but also to displays, redundancy management, fault monitoring, and mission systems. Because software was trusted with flight-critical functions so early, airborne systems also developed rigorous assurance frameworks earlier than most other sectors.
In marine systems, software was introduced more gradually and often first appeared as an aid to navigation, propulsion monitoring, and ship management rather than as the immediate core of vessel control. During the 1980s and 1990s, software became increasingly important through GPS integration, electronic charting, digital propulsion governors, alarm monitoring, and networking standards such as NMEA 0183 and NMEA 2000. As ships adopted Integrated Bridge Systems and Integrated Platform Management Systems, software took on a more integrative role, connecting radar, sonar, charting, safety alerts, and propulsion information into shared consoles and coordinated workflows. The marine sector generally moved more slowly than aerospace or automotive because of lower production volumes, long vessel lifecycles, and a historically stronger dependence on mechanical and human-operated systems. Still, the same underlying pattern emerged: software shifted from assisting operators to structuring the flow of information and control across the vessel.
In space systems, software became important very early because spacecraft had to function with limited or delayed human intervention. Even early missions required onboard digital logic for guidance, control, telemetry, and fault management. Apollo is a landmark example: NASA records describe the Apollo primary guidance, navigation, and control system as centered on the Apollo Guidance Computer, making software a mission-critical part of spacecraft operation during the 1960s lunar program. In later decades, spacecraft software expanded to support attitude control, payload operation, onboard data handling, autonomous fault detection, and increasingly software-defined mission behavior. Modern space systems add reconfigurable payloads, autonomous navigation, and onboard AI, but the historical pattern remains continuous: because space systems operate remotely and under extreme constraints, software has long been essential not just for convenience, but for basic mission survival and autonomy.
As software methods migrated from traditional computing into cyber-physical systems (CPS), a distinct class of software infrastructure emerged to manage the tight coupling between computation and the physical world. Central to this evolution was the adoption of real-time operating systems (RTOSes), which provide deterministic task scheduling, bounded interrupt latency, and predictable timing behavior, all properties essential for interacting with sensors, actuators, and control loops. Unlike general-purpose operating systems, RTOSes are designed to guarantee that critical tasks execute within strict temporal constraints, often using priority-based preemptive scheduling and carefully managed resource sharing. Representative RTOS implementations include VxWorks, widely used in aerospace and defense systems; QNX, common in automotive and industrial platforms; and FreeRTOS, broadly adopted in embedded and IoT devices. In addition to RTOS kernels, CPS software stacks increasingly incorporated device drivers, middleware for communication (e.g., message queues and publish–subscribe frameworks such as DDS), and hardware abstraction layers (HALs) to isolate application logic from platform-specific details. These components enabled modular software architectures while preserving the determinism required for control and safety.
Across the ground, airborne, marine, and space domains, RTOS-based architectures became foundational to system design, with domain-specific adaptations. In ground systems, automotive platforms standardized software stacks such as AUTOSAR, where RTOS scheduling supports electronic control units (ECUs) for engine management, anti-lock braking (ABS), and advanced driver assistance systems (ADAS). In airborne systems, avionics platforms such as the Boeing 787 rely on partitioned RTOS environments (often based on VxWorks) to meet stringent safety certification requirements (e.g., DO-178C), ensuring temporal and spatial isolation between flight-critical functions. In marine systems, integrated bridge and navigation systems, such as those used on modern commercial vessels and naval ships, employ real-time software (often QNX-based) to coordinate radar, GPS, and autopilot control loops under standards like IEC 61162 (NMEA). In space systems, spacecraft such as the Mars Perseverance Rover use RTOS platforms like VxWorks to manage guidance, navigation, and control in environments where remote operation and fault tolerance are essential.
Over time, these systems evolved from tightly coupled, monolithic implementations to more layered and componentized architectures, incorporating standardized interfaces and increasingly sophisticated middleware. This progression laid the groundwork for modern trends such as software-defined vehicles, autonomous systems, and distributed CPS platforms, where software not only controls physical processes but also enables continuous updates, adaptability, and higher-level system intelligence.
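The priority-based preemptive scheduling mentioned above can be illustrated with a small tick-based simulation. This is a teaching sketch, not how any particular RTOS is implemented: time advances in integer ticks, each task's deadline is assumed equal to its period, a job that misses its deadline is dropped, and the task names and parameters are hypothetical.

```python
def simulate_fixed_priority(tasks, horizon):
    """Simulate fixed-priority preemptive scheduling on one core.
    tasks: dicts with name, period, wcet (ticks) and priority
    (lower number = higher priority). Returns (timeline, missed)."""
    remaining = {t["name"]: 0 for t in tasks}
    timeline, missed = [], []
    for tick in range(horizon):
        for t in tasks:
            if tick % t["period"] == 0:            # new job released
                if remaining[t["name"]] > 0:       # previous job unfinished
                    missed.append((t["name"], tick))
                remaining[t["name"]] = t["wcet"]   # late job is dropped
        ready = [t for t in tasks if remaining[t["name"]] > 0]
        if ready:
            # The highest-priority ready task always runs, which preempts
            # any lower-priority task that was running in the previous tick.
            running = min(ready, key=lambda t: t["priority"])
            remaining[running["name"]] -= 1
            timeline.append(running["name"])
        else:
            timeline.append("idle")
    return timeline, missed

tasks = [
    {"name": "control",    "period": 4,  "wcet": 1, "priority": 0},
    {"name": "perception", "period": 6,  "wcet": 2, "priority": 1},
    {"name": "logging",    "period": 12, "wcet": 3, "priority": 2},
]
timeline, missed = simulate_fixed_priority(tasks, horizon=12)
print(timeline)
print("missed deadlines:", missed)

overloaded = [
    {"name": "a", "period": 2, "wcet": 1, "priority": 0},
    {"name": "b", "period": 4, "wcet": 3, "priority": 1},
]
print(simulate_fixed_priority(overloaded, horizon=8)[1])
```

In the first task set, `logging` is interrupted at tick 4 when `control` is released and resumes afterwards, yet every job still finishes before its next release. The second set demands more processor time than exists, so the lower-priority task misses its deadline while the higher-priority task is unaffected.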
In cyber-physical systems (CPS), the role of open-source software has been more gradual but increasingly significant, particularly as systems have become more complex, networked, and software-defined. Platforms such as FreeRTOS, Zephyr, and middleware frameworks like ROS have enabled broader access to embedded and robotic system development, fostering innovation in domains such as autonomous vehicles, industrial automation, and drones. Open-source approaches in CPS provide advantages in transparency, flexibility, and community-driven validation, which are particularly valuable for research and prototyping. However, their adoption in safety-critical domains—such as avionics, automotive safety systems, and space missions—has required careful integration with certification processes, long-term support models, and rigorous verification and validation practices. Increasingly, hybrid models are emerging in which open-source components form the foundation of development platforms, while certified, domain-specific layers ensure compliance with safety and reliability requirements, reflecting a convergence between the open innovation model of IT and the stringent assurance needs of cyber-physical systems.
As software moved from advisory and convenience roles into closed-loop control, fault management, and autonomy, safety standards had to shift from focusing mainly on hardware reliability to addressing software behavior, development process, traceability, and verification evidence. The big historical move was this: hardware could often be analyzed in terms of random failures and wear-out mechanisms, but software introduced a different kind of risk—systematic faults from requirements errors, design flaws, implementation mistakes, and unexpected interactions. That forced each domain to build standards that emphasized lifecycle rigor, requirements traceability, verification independence, configuration control, and structured safety arguments rather than just component robustness. IEC 61508 became the broad functional-safety reference point for programmable electronic systems and explicitly includes software requirements in Part 3, while later domain-specific standards adapted that logic to their own operating environments.
In ground systems, especially automotive, the early era of software safety was relatively informal: OEMs and suppliers used internal engineering discipline, testing, and FMEA-style thinking, but there was no unified framework tailored to vehicle software. As vehicles became software-intensive—first in engine control, then braking, steering, airbags, networking, and ADAS—the industry needed a standard that treated software as part of a full safety lifecycle. That came through ISO 26262, first published in 2011 as an adaptation of IEC 61508 for road vehicles. ISO 26262 introduced Automotive Safety Integrity Levels (ASILs), hazard analysis and risk assessment, lifecycle processes, and safety measures for both hardware and software, embedding software assurance into vehicle development rather than leaving it as a late-stage test problem. In practical terms, the standard pushed the automotive industry toward stronger requirements engineering, bidirectional traceability, safer software architecture, verification planning, and formal integration of software into system-level safety cases.
In airborne systems, software safety standards emerged earlier and with greater rigor because software entered flight-critical functions sooner. Aviation could not treat software as just another engineering layer once digital flight control, navigation, and avionics displays became mission- and safety-critical. That is why DO-178, originally published in 1981, became so influential: it defined design assurance for airborne software and tied development rigor to the criticality of the function. Over time this matured through DO-178B and then DO-178C in 2011, which remains the core software assurance framework recognized by the FAA through AC 20-115D. The airborne sector’s key historical move was to make software safety depend not on testing alone, but on documented objectives, lifecycle evidence, configuration control, structural coverage, tool qualification where needed, and verification commensurate with software level. In other words, aviation moved earliest and most clearly toward the idea that safe software is demonstrated through a disciplined assurance process, not just by showing that a program “seems to work.”
In marine systems, the evolution was slower and more fragmented. Marine governance historically focused more on mechanical integrity, redundancy, seaworthiness, and prescriptive equipment rules than on software-specific lifecycle assurance. As ships adopted integrated bridge systems, dynamic positioning, digital navigation, and autonomous functions, classification societies such as DNV, ABS, and Lloyd’s Register increasingly had to account for software quality, cyber resilience, and failure behavior in control systems. But unlike aviation and automotive, the marine sector did not converge as early on a single universally dominant software-safety standard. Instead, it has generally relied on a patchwork of class rules, IEC-derived functional-safety thinking, equipment standards, and system-specific assurance practices. So the historical movement in marine has been from equipment approval and redundancy rules toward a more software-aware model, but one that still remains less unified and less process-centered than in aerospace or automotive. That difference reflects the sector’s lower production volumes, varied vessel types, long lifecycles, and less centralized certification structure.
In space systems, software safety evolved under extreme mission-assurance constraints rather than through a single commercial certification pathway. Space programs recognized early that software errors could be catastrophic because repair is difficult or impossible, communication delays are long, and missions are expensive. For a long time, safety was handled through agency-specific reliability doctrine, redundancy, conservative design, and system engineering discipline rather than a single software certification standard like DO-178. NASA’s own software-safety framework became more explicit with NASA-STD-8719.13, first issued in 1997 and updated since; NASA describes it as specifying the activities necessary to ensure safety is designed into software acquired or developed by the agency. The space sector’s historical movement, then, has been from mission-specific reliability practice toward more formalized software-safety activities, documentation, and risk-scaled rigor. Compared with airborne systems, the emphasis is often less on certifying a product line for repeated operation and more on ensuring that mission-specific software hazards are identified, mitigated, and managed as part of a broader system safety case.
Software entered complex engineered products long before anyone talked about “software-defined” anything. In the earliest generations of electronic products, software was small, tightly coupled to a specific hardware function, and often treated almost like firmware: a fixed control layer burned into ROM or maintained by a small engineering team. Productization in that era was primarily a hardware discipline. Once the design was frozen and qualified, the software was expected to stay stable for years, sometimes for the entire product life. Maintainability existed, but mostly in the form of patching defects, issuing service updates, and preserving compatibility with replacement hardware. The supply chain focus was similarly physical: semiconductors, boards, connectors, and mechanical parts dominated risk and planning. Software dependencies were limited enough that organizations could often understand the full stack internally. That began to change as products became networked, feature-rich, and digitally updatable.
From the 1980s through the 2000s, software became a much larger share of product value, especially in embedded systems, telecommunications, aerospace, and automotive electronics. This changed productization from a one-time release activity into an ongoing lifecycle problem. A product now had to be launched, updated, serviced, secured, and sometimes reconfigured in the field. Maintainability became more than clean code or modular design; it came to mean version control across hardware variants, traceability from requirements to deployed binaries, long-term support for aging platforms, and the ability to diagnose failures across interacting subsystems. At the same time, the software supply chain became more complex. Instead of mostly internal code, products increasingly depended on third-party operating systems, middleware, protocol stacks, compilers, libraries, vendor SDKs, and eventually open-source components. NIST now describes the software supply chain as the collection of activities involved in producing and delivering software, noting that its integrity depends on the security and discipline of those activities; modern guidance emphasizes practices such as SBOMs, vendor risk assessment, vulnerability management, and secure development frameworks. Historically, that marks a major shift: software was no longer just something a company wrote, but something it assembled, integrated, inherited, and continuously governed.
The modern phase extends this logic even further. In connected products, especially vehicles, software is now a primary means of differentiation, feature delivery, and even business model evolution. That is where the idea of the software-defined vehicle (SDV) comes in. Historically, vehicles were built around many function-specific ECUs with tightly coupled hardware and software, and new capability typically arrived only with a new model year or hardware redesign. The SDV concept reflects a move away from that paradigm toward centralized or zonal computing, richer abstraction layers, and over-the-air updatability, so that features, performance, user experience, and even some platform behavior can evolve after the vehicle is sold. Industry analysts describe this shift as part of a broader transition in automotive E/E architecture, where software and centralized computing become the core enablers of innovation and ongoing value creation. From a historical perspective, the SDV is the endpoint of a long arc: products began as hardware with a little embedded code, became integrated systems whose success depended on software lifecycle management, and are now increasingly understood as updatable software platforms embodied in hardware.
IT-based software is verified through a structured combination of requirements-based testing, code analysis, and runtime validation, augmented by principles from Carnegie Mellon University Software Engineering Institute methodologies such as the Capability Maturity Model Integration and disciplined software engineering practices. Verification begins with ensuring that requirements are well-defined, traceable, and testable—aligned with CMMI’s emphasis on requirements management and validation. Development proceeds through unit, integration, and system testing, supported by peer reviews, formal inspections, and static analysis, reflecting SEI’s focus on early defect removal and process discipline. Measurement and analysis play a key role, with metrics collected to assess defect density, coverage, and process performance. Configuration management ensures that all artifacts (code, tests, requirements) are version-controlled and reproducible, while process maturity levels guide organizations toward increasingly predictable and optimized verification practices. Continuous integration pipelines automate regression testing, and in higher-maturity environments, quantitative process control and causal analysis are used to systematically improve quality. Finally, verification extends into operations through monitoring and feedback loops, embodying the SEI philosophy of continuous process improvement across the software lifecycle.
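The requirements traceability and measurement ideas above can be illustrated with a minimal sketch. It is not taken from CMMI or any SEI tool. The names `Requirement`, `CaseResult`, `traceability_report`, and `defect_density` are hypothetical, and the sample data is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    req_id: str
    text: str


@dataclass(frozen=True)
class CaseResult:
    test_id: str
    covers: tuple  # IDs of the requirements this test verifies
    passed: bool


def traceability_report(requirements, results):
    """List requirements that have no test, and requirements with a failing test."""
    covering = {r.req_id: [] for r in requirements}
    for result in results:
        for req_id in result.covers:
            if req_id in covering:
                covering[req_id].append(result)
    untested = sorted(rid for rid, rs in covering.items() if not rs)
    failing = sorted(
        rid for rid, rs in covering.items() if any(not r.passed for r in rs)
    )
    return {"untested": untested, "failing": failing}


def defect_density(defects, lines_of_code):
    """Defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Hypothetical sample data.
requirements = [
    Requirement("REQ-1", "Log every login attempt"),
    Requirement("REQ-2", "Lock account after five failures"),
    Requirement("REQ-3", "Export audit log as CSV"),
]
results = [
    CaseResult("T-1", ("REQ-1",), passed=True),
    CaseResult("T-2", ("REQ-1", "REQ-2"), passed=False),
]
report = traceability_report(requirements, results)
```

In this sample, `REQ-3` has no covering test, and the failing test `T-2` flags both requirements it covers. A real pipeline would pull the same information from a requirements tool and the CI test results.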
Validation of cyber-physical software places strong emphasis on hardware/software co-verification using a spectrum of simulation and emulation techniques to ensure correct behavior before deployment in the physical world. At the earliest stages, model-in-the-loop (MIL) and software-in-the-loop (SIL) simulations evaluate control algorithms and software logic against mathematical models of the environment and plant dynamics. These are followed by hardware-in-the-loop (HIL) approaches, where real control software executes on target or representative hardware while interacting with simulated sensors, actuators, and physical processes in real time—commonly used in automotive engine control, avionics flight systems, and industrial automation. As system complexity increases, processor-in-the-loop (PIL) and full-system emulation platforms enable timing-accurate execution and validation of embedded software under realistic workloads. In semiconductor and advanced embedded domains, platforms such as QEMU and commercial FPGA-based emulators allow early software bring-up prior to silicon availability. Across these stages, validation focuses not only on functional correctness but also on timing determinism, fault handling, and interaction with physical processes. This layered approach enables progressive risk reduction, bridging the gap between abstract models and real-world deployment while supporting the stringent safety and reliability requirements of cyber-physical systems.
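As a rough illustration of the software-in-the-loop stage, the sketch below runs a controller against a mathematical plant model entirely in software. The first-order plant, the PI gains, the time step, and the function names `simulate_closed_loop` and `settling_time` are all assumptions chosen for the example. A real SIL setup would run the production control code against a validated plant model.

```python
def simulate_closed_loop(setpoint=1.0, kp=2.0, ki=4.0, tau=0.5,
                         dt=0.01, duration=10.0):
    """Simulate a PI controller driving a first-order plant dx/dt = (u - x) / tau.

    Uses forward-Euler integration and returns the plant output at every step.
    """
    x = 0.0          # plant state (for example, a normalised speed)
    integral = 0.0   # integral of the control error
    trace = []
    for _ in range(int(round(duration / dt))):
        error = setpoint - x
        integral += error * dt
        u = kp * error + ki * integral   # controller under test
        x += dt * (u - x) / tau          # plant model
        trace.append(x)
    return trace


def settling_time(trace, setpoint, dt, band=0.02):
    """Time after which the output stays within +/- band of the setpoint.

    Returns None if the output is still outside the band at the end of the run.
    """
    last_outside = -1
    for i, x in enumerate(trace):
        if abs(x - setpoint) > band * abs(setpoint):
            last_outside = i
    if last_outside == len(trace) - 1:
        return None
    return (last_outside + 1) * dt


trace = simulate_closed_loop()
print("final output:", round(trace[-1], 4))
print("settling time [s]:", settling_time(trace, 1.0, 0.01))
```

The same test harness can later be pointed at the controller running on target hardware, which is the essence of moving from SIL to HIL: the checks stay the same while the execution platform becomes more realistic.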
In summary, a dominant IT electronic ecosystem drives the fundamental rhythm of hardware and software development. Cyber-physical systems, with considerably lower volume, have had to adapt to this dominant rhythm in the following ways:
Taken together, the move from largely mechanical systems to software-defined vehicles is a massive shift in design, manufacturing, support, and even legal ownership. Software is typically licensed to the OEM and then to the final customer.
Modern autonomous systems — from self-driving cars and unmanned aerial vehicles (UAVs) to marine robots and industrial co-bots — depend fundamentally on software architectures capable of real-time sensing, decision-making, and control. While mechanical and electronic components define what a system can do, the software stack defines how it does it — how it perceives the world, interprets data, plans actions, and interacts safely with its environment [66,67]. Autonomy software differs from conventional embedded or enterprise software in several critical ways:
This combination of safety-critical engineering and AI-driven decision-making makes autonomy software one of the most challenging areas in modern computing.
Autonomy software must achieve four key functional objectives [68,69]:
Each of these objectives corresponds to distinct software layers and modules in the autonomy stack.
| Characteristic | Description | Importance |
|---|---|---|
| Real-time Execution | Must process sensor data and react within milliseconds. | Ensures safety and stability. |
| Determinism | Predictable behaviour under defined conditions. | Required for validation and trust. |
| Scalability | Supports increased sensor data and compute complexity. | Allows future upgrades. |
| Interoperability | Integrates diverse hardware, OS, and middleware. | Facilitates modularity. |
| Resilience | Must continue functioning despite partial failures. | Critical for mission continuity. |
| Adaptability | Learns from data or updates behaviour dynamically. | Key for AI-driven autonomy. |
These characteristics drive architectural decisions and the choice of frameworks (e.g., ROS, AUTOSAR Adaptive, DDS).
Autonomy software is layered, combining multiple software technologies:
The combination of these layers forms the autonomy software stack, which enables complex behaviour while maintaining reliability. A defining aspect of autonomy software is its reliance on middleware — frameworks that manage interprocess communication (IPC), data distribution, and time synchronisation across distributed computing nodes. Some of the widely used standards:
A complete software stack is a layered collection of software components, frameworks, and libraries that work together to deliver a complete set of system functionalities. Each layer provides services to the layer above it and depends on the layer below it. Middleware, which is an essential part of the multi-layered architectures, ensures that all layers of the software stack can exchange information deterministically and safely [70]. In autonomous systems, the software stack enables integration between:
It’s the backbone that allows autonomy to function as a cohesive system rather than a set of disconnected modules (Quigley et al., 2009; Maruyama et al., 2016). From a technical perspective, the software stack defines how functionality, data flow, and control are structured within the system.
Modularity and Abstraction
Each layer isolates complexity by providing a clean interface to the one above.
Real-Time and Deterministic Behaviour
Autonomous systems rely on real-time responses. The stack architecture ensures:
Interoperability
Middleware such as ROS 2 or DDS standardises interprocess communication. This allows different vendors’ software modules (e.g., LiDAR driver from Company A and planner from Company B) to work together.
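The decoupling that middleware provides can be sketched with a toy in-process publish/subscribe bus. This is not the ROS 2 or DDS API. `MessageBus`, `LidarDriver`, `Planner`, and the topic name `scan` are hypothetical names used only to show that the two modules share a topic and a message format, not code.

```python
from collections import defaultdict


class MessageBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver payload to every subscriber; return how many received it."""
        for callback in self._subscribers[topic]:
            callback(payload)
        return len(self._subscribers[topic])


class LidarDriver:
    """Stand-in for a driver from 'Company A'; knows only the bus and topic."""

    def __init__(self, bus):
        self._bus = bus

    def emit_scan(self, ranges_m):
        return self._bus.publish("scan", {"ranges_m": list(ranges_m)})


class Planner:
    """Stand-in for a planner from 'Company B'; never imports the driver."""

    def __init__(self, bus):
        self.nearest_m = None
        bus.subscribe("scan", self._on_scan)

    def _on_scan(self, message):
        self.nearest_m = min(message["ranges_m"])


bus = MessageBus()
planner = Planner(bus)
driver = LidarDriver(bus)
driver.emit_scan([2.5, 0.8, 4.1])
print(planner.nearest_m)
```

Real middleware adds what this sketch omits: serialisation, discovery across processes and machines, and quality-of-service settings.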
Fault Tolerance and Redundancy
Stack layering supports redundant paths for safety-critical functions. If a perception node fails, a backup process may take over seamlessly — ensuring resilience, especially in aerospace and automotive systems [72].
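One common way to realise such a takeover is heartbeat supervision, sketched below under simplifying assumptions. The class `FailoverSupervisor`, the node names, and the 0.2 s timeout are hypothetical. Time is passed in explicitly so the behaviour is deterministic and easy to test.

```python
class FailoverSupervisor:
    """Select the active node from a primary/backup pair based on heartbeats."""

    def __init__(self, primary, backup, timeout_s):
        self.primary = primary
        self.backup = backup
        self.timeout_s = timeout_s
        self._last_seen = {}

    def heartbeat(self, node, now):
        """Record that `node` reported itself healthy at time `now` (seconds)."""
        self._last_seen[node] = now

    def _alive(self, node, now):
        last = self._last_seen.get(node)
        return last is not None and (now - last) <= self.timeout_s

    def active_node(self, now):
        """Return the node that should be trusted, or None if both are silent."""
        if self._alive(self.primary, now):
            return self.primary
        if self._alive(self.backup, now):
            return self.backup
        return None  # caller should trigger a safe-stop behaviour


supervisor = FailoverSupervisor("perception_primary", "perception_backup",
                                timeout_s=0.2)
supervisor.heartbeat("perception_primary", now=0.0)
supervisor.heartbeat("perception_backup", now=0.0)
print(supervisor.active_node(now=0.1))   # primary still within its timeout
supervisor.heartbeat("perception_backup", now=0.3)
print(supervisor.active_node(now=0.35))  # primary silent, backup takes over
```

Returning `None` when both nodes are silent keeps the decision about the safe fallback in the caller, where the system-level safety concept belongs.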
Continuous Integration and Simulation
A layered design allows developers to:
Management and Organisational Importance
From a software engineering management perspective, a defined software stack provides structure and governance for the development process, with the following main advantages:
Division of Labour
Teams can specialise by layer: for example, one group handles perception, another control, another middleware. This parallelises development and allows use of domain expertise without interference.
Reusability and Version Control
Reusable modules and APIs speed up development. Tools like Git, Docker, and CI/CD pipelines ensure traceability, maintainability, and fast updates across distributed teams.
Scalability and Lifecycle Management
A well-structured stack can be extended with new sensors or algorithms without re-architecting the entire system. Lifecycle management tools (e.g., ROS 2 launch systems, AUTOSAR Adaptive manifests) maintain version consistency and dependency control.
Quality Assurance (QA) and Certification
Layered software stacks make it easier to apply quality control and compliance frameworks, such as ISO 26262 (automotive safety software), DO-178C (aerospace software), or IEC 61508 (functional safety in automation). Each layer can be validated separately, simplifying documentation and certification workflows.
Cost and Risk Reduction
When multiple projects share a unified software stack, the cost of testing, validation, and maintenance drops significantly. This approach underpins industry-wide initiatives like AUTOSAR, which standardises vehicle software to lower integration costs.
The Layered Stack as an Organisational Blueprint
In large autonomy projects (e.g., Waymo, Tesla), the software stack also serves as an organisational structure. Teams are aligned with layers:
Thus, the software stack doubles as both a technical architecture and an organisational map for coordination and accountability [73].
Real-World Example: ROS 2 as a Layered Stack
The Robot Operating System 2 (ROS 2) exemplifies how modular software stacks are implemented:
This layered model has become the foundation for numerous autonomous systems in academia and industry, from mobile robots to autonomous vehicles [74].
Advantages of a Well-Defined Software Stack
| Advantage | Description |
|---|---|
| Clarity and Structure | Simplifies system understanding and onboarding. |
| Parallel Development | Enables multiple teams to work concurrently. |
| Interchangeability | Supports component replacement without total redesign. |
| Scalability | Allows future expansion with minimal rework. |
| Maintainability | Facilitates debugging, upgrades, and certification. |
| Efficiency | Reduces cost, redundancy, and integration risk. |
In essence, a software stack is not merely a technical artefact; it is a strategic enabler that aligns engineering processes, organisational structure, and the long-term sustainability of autonomous platforms. The autonomy software stack and its development and maintenance challenges are discussed in the following chapters.
The software lifecycle defines the complete process by which software is conceived, developed, deployed, maintained, and eventually retired. In the context of modern engineering — particularly for complex systems such as autonomous platforms, embedded systems, or enterprise solutions — understanding the lifecycle is essential to ensure quality, reliability, and maintainability. The lifecycle acts as a roadmap that guides project teams through stages of development and management. Each stage defines specific deliverables, milestones, and feedback loops, ensuring that the software evolves in a controlled, traceable, and predictable way 8).
“The software lifecycle refers to a structured sequence of processes and activities required to develop, maintain, and retire a software system.” — 9) In other words, the lifecycle describes how a software product transitions from idea to obsolescence — incorporating all the engineering, management, and maintenance steps along the way. The lifecycle ensures:
In regulated domains like aerospace, automotive, and medical devices, adherence to a defined lifecycle is also a legal requirement for certification and compliance (e.g., ISO/IEC 12207, DO-178C, ISO 26262).
Different industries and projects adopt specific lifecycle models based on their goals, risk tolerance, and team structure. The most widely used models are explained in this chapter.
The Waterfall Model is one of the earliest and most widely recognised software lifecycle models. It follows a linear sequence of stages where each phase must be completed before the next begins 10).
Advantages:
Limitations:
An evolution of the waterfall approach, the V-Model emphasises testing and validation at each development stage. Each “downward” step (development) has a corresponding “upward” step (testing/validation).
Advantages:
Limitations:
Instead of completing the whole system in one sequence, the iterative model develops the product through multiple cycles or increments. Each iteration delivers a working version that can be reviewed and refined.
Advantages:
Limitations:
Agile development (e.g., Scrum, Kanban, Extreme Programming) emphasises collaboration, adaptability, and customer feedback. It replaces rigid processes with iterative cycles known as sprints.
Core Principles 11):
Advantages:
Challenges:
Introduced by Boehm 12), the Spiral Model combines iterative development with risk analysis. Each loop of the spiral represents one phase of the process, with risk evaluation at its core.
Advantages:
Limitations:
Modern systems increasingly adopt DevOps — integrating development, testing, deployment, and operations into a continuous cycle. This model leverages automation, CI/CD pipelines, and cloud-native
Advantages:
Challenges:
| Model | Main Focus | Advantages | Best Suited For |
|---|---|---|---|
| Waterfall | Sequential structure | Simple, predictable | Small or regulated projects |
| V-Model | Verification and validation | Traceable, certifiable | Safety-critical systems |
| Iterative/Incremental | Progressive refinement | Flexible, early testing | Complex evolving systems |
| Agile | Collaboration & feedback | Fast adaptation, user-centric | Software startups, dynamic projects |
| Spiral | Risk-driven development | Risk control, scalability | Large R&D projects |
| DevOps | Continuous integration | Automation, rapid delivery | Cloud, AI, or autonomous platforms |
In software engineering, Configuration Management (CM) refers to the systematic process of identifying, organising, controlling, and tracking all changes made to a software system throughout its lifecycle. It ensures that:
According to ISO/IEC/IEEE 828:2012, CM is defined as: “A discipline applying technical and administrative direction and surveillance to identify and document the functional and physical characteristics of a configuration item, control changes to those characteristics, and record and report change processing and implementation status.”
In other words, Configuration Management keeps the software stable while it evolves. Configuration management exists to:
To understand CM, several foundational terms must be defined.
Configuration Item (CI) A Configuration Item is any component of the system that is subject to configuration control. Examples include:
Each CI is uniquely identified, versioned, and tracked over time 15).
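A minimal sketch of how a configuration item might be identified, versioned, and tracked is shown below. The classes `ConfigurationItem` and `CIRegistry` are hypothetical. Using a SHA-256 content digest as a fingerprint is an illustrative choice, not something prescribed by the text.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigurationItem:
    ci_id: str      # unique identifier, e.g. "planner"
    version: str    # version label, e.g. "1.1.0"
    content: bytes  # the artefact itself (source archive, binary, document)

    @property
    def digest(self):
        """Content fingerprint used to detect silent modifications."""
        return hashlib.sha256(self.content).hexdigest()


class CIRegistry:
    """Keeps the full version history of every configuration item."""

    def __init__(self):
        self._history = {}  # ci_id -> list of ConfigurationItem, oldest first

    def register(self, item):
        versions = self._history.setdefault(item.ci_id, [])
        if any(existing.version == item.version for existing in versions):
            raise ValueError(f"{item.ci_id} {item.version} already registered")
        versions.append(item)

    def latest(self, ci_id):
        return self._history[ci_id][-1]

    def history(self, ci_id):
        return [(item.version, item.digest) for item in self._history[ci_id]]


registry = CIRegistry()
registry.register(ConfigurationItem("planner", "1.0.0", b"planner build A"))
registry.register(ConfigurationItem("planner", "1.1.0", b"planner build B"))
print(registry.latest("planner").version)
```

Rejecting a second registration of the same version prevents an existing version label from being reused for different content.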
Baseline A baseline is a formally approved version of one or more configuration items that serves as a reference point. Once established, any changes to the baseline must follow a defined change control process. Types of baselines:
Baselines create stability checkpoints in the lifecycle 16).
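The idea of a baseline as a frozen reference point can be sketched as a read-only mapping plus a comparison function. The helper names `make_baseline` and `diff_against_baseline` and the component versions are hypothetical.

```python
from types import MappingProxyType


def make_baseline(versions):
    """Freeze a {ci_id: version} mapping so it cannot be edited in place."""
    return MappingProxyType(dict(versions))


def diff_against_baseline(baseline, current):
    """Report how a working configuration deviates from the approved baseline."""
    added = sorted(ci for ci in current if ci not in baseline)
    removed = sorted(ci for ci in baseline if ci not in current)
    changed = {
        ci: (baseline[ci], current[ci])
        for ci in baseline
        if ci in current and baseline[ci] != current[ci]
    }
    return {"added": added, "removed": removed, "changed": changed}


release_baseline = make_baseline({"planner": "1.0.0", "controller": "2.3.1"})
working_copy = {"planner": "1.1.0", "controller": "2.3.1", "logger": "0.1.0"}
print(diff_against_baseline(release_baseline, working_copy))
```

Every entry in the resulting diff is a candidate for the change control process. The baseline itself changes only by creating a new, separately approved baseline.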
Version Control Version control systems (VCS), such as Git, Mercurial, or Subversion, track and manage modifications to source code and other files. They enable:
Version control forms the technical backbone of configuration management.
Change Management Change management defines how modifications are proposed, evaluated, approved, and implemented. Typical steps:
This structured approach ensures accountability and quality control 17).
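A change control workflow of this kind is often modelled as a small state machine. The sketch below assumes the states proposed, evaluated, approved or rejected, and implemented, following the description above. The class names, the transition table, and the actor labels are hypothetical.

```python
from enum import Enum


class State(Enum):
    PROPOSED = "proposed"
    EVALUATED = "evaluated"
    APPROVED = "approved"
    REJECTED = "rejected"
    IMPLEMENTED = "implemented"


# Allowed transitions; anything else is refused.
ALLOWED = {
    State.PROPOSED: {State.EVALUATED},
    State.EVALUATED: {State.APPROVED, State.REJECTED},
    State.APPROVED: {State.IMPLEMENTED},
    State.REJECTED: set(),
    State.IMPLEMENTED: set(),
}


class ChangeRequest:
    def __init__(self, cr_id, description):
        self.cr_id = cr_id
        self.description = description
        self.state = State.PROPOSED
        self.log = []  # (from_state, to_state, actor) entries for accountability

    def advance(self, new_state, actor):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.cr_id}: {self.state.value} -> {new_state.value} not allowed"
            )
        self.log.append((self.state, new_state, actor))
        self.state = new_state


cr = ChangeRequest("CR-42", "Raise planner replanning rate")
cr.advance(State.EVALUATED, actor="systems engineer")
cr.advance(State.APPROVED, actor="change control board")
cr.advance(State.IMPLEMENTED, actor="developer")
print(cr.state.value, len(cr.log))
```

Because every transition is logged with its actor, the record of who evaluated, approved, and implemented a change is produced as a side effect of following the process.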
Configuration Audit A configuration audit verifies that the configuration items and documentation:
Two common types:
Audits maintain integrity and compliance, especially in defence and aerospace projects 18).
Even though CM brings structure and order, it faces numerous practical challenges, particularly in distributed and complex systems.
Complexity and Scale Modern systems can contain millions of lines of code, hundreds of dependencies, and multiple configurations for different platforms. Managing all these variations manually is infeasible. Example: An autonomous vehicle might include distinct configurations for:
Solution: Automated configuration management with metadata-driven tools (e.g., Ansible, Puppet, Kubernetes Helm).
Multiple Development Streams In large projects, teams work on multiple branches or versions simultaneously (e.g., development, testing, release). This increases the risk of:
Solution:
Hardware–Software Interdependencies In embedded or cyber-physical systems, configurations depend on hardware variants (processors, sensors, memory). Maintaining alignment between software builds and hardware specifications is difficult. Mitigation:
Frequent Updates and Continuous Delivery In the DevOps era, software may be updated multiple times per day across thousands of devices. Each update must maintain consistency and rollback capability. Challenge:
Solution:
Data and Configuration Drift Configuration drift occurs when the system’s actual state deviates from its documented configuration — common in dynamic, cloud-based systems. Causes:
Prevention:
Regulatory and Compliance Demands In domains like aerospace, medical, and automotive, configuration management is a compliance requirement under standards such as ISO/IEC/IEEE 12207, ISO 26262, or IEC 61508. Challenge:
Solution:
Human and Organisational Factors The most difficult aspect of CM is often cultural, not technical. Teams may resist documentation or formal change control due to perceived bureaucracy. As a result:
Solution:
Configuration management (CM) is not a single activity but a cyclic process integrated into the entire software lifecycle. The ISO/IEC/IEEE 828:2012 standard identifies four principal activities:
In modern practice, a fifth step — Configuration Verification and Review — is also added for continuous improvement and compliance.
Configuration Identification The first step in CM defines what needs to be managed. It involves:
Example hierarchy:
Tools & Techniques:
Goal: Create a clear inventory of every managed artefact and its dependencies.
Tools and Techniques:
Goal: Ensure that every change is reviewed, justified, and properly recorded before being implemented.
Configuration Status Accounting (CSA) CSA provides visibility into the current state of configurations across the project. It records which versions of CIs exist, where they are stored, and what changes have occurred. Typical outputs include:
Tools & Techniques:
Goal: Provide transparency and traceability, so project managers and auditors can reconstruct the exact configuration of any product version at any point in time.
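The reconstruction goal can be illustrated with an append-only ledger of version events that is replayed up to a chosen point in time. `StatusLedger`, its method names, and the integer timestamps are assumptions made for this sketch. Real status accounting would draw on VCS history and release records.

```python
from bisect import insort


class StatusLedger:
    """Records which version of each CI became current, and when."""

    def __init__(self):
        self._events = []  # (timestamp, ci_id, version), kept sorted by time

    def record(self, timestamp, ci_id, version):
        insort(self._events, (timestamp, ci_id, version))

    def configuration_at(self, timestamp):
        """Replay events up to `timestamp` to rebuild the configuration then."""
        state = {}
        for event_time, ci_id, version in self._events:
            if event_time > timestamp:
                break
            state[ci_id] = version
        return state


ledger = StatusLedger()
ledger.record(1, "planner", "1.0.0")
ledger.record(2, "controller", "2.3.1")
ledger.record(5, "planner", "1.1.0")
print(ledger.configuration_at(3))
print(ledger.configuration_at(5))
```

Keeping events rather than only the latest state is the design choice that makes past configurations recoverable for audits and incident analysis.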
Configuration Audit A Configuration Audit ensures the product conforms to its baseline and that all changes were properly implemented and documented. It verifies:
There are two types:
Tools & Techniques:
Goal: Ensure integrity, consistency, and compliance across the entire configuration baseline.
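Part of such an audit can be automated by comparing what is actually deployed against the digests recorded in the baseline. In the sketch below, the function names, the manifest layout (file name mapped to SHA-256 digest), and the sample files are hypothetical.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def audit(baseline_manifest, deployed_files):
    """Compare deployed artefacts with the baseline manifest.

    baseline_manifest: {name: expected SHA-256 hex digest}
    deployed_files:    {name: file content as bytes}
    Returns a sorted list of human-readable findings (empty means conformant).
    """
    findings = []
    for name, expected in baseline_manifest.items():
        if name not in deployed_files:
            findings.append(f"MISSING {name}")
        elif sha256_hex(deployed_files[name]) != expected:
            findings.append(f"MODIFIED {name}")
    for name in deployed_files:
        if name not in baseline_manifest:
            findings.append(f"UNEXPECTED {name}")
    return sorted(findings)


manifest = {
    "planner.bin": sha256_hex(b"planner v1.1.0"),
    "controller.bin": sha256_hex(b"controller v2.3.1"),
}
deployed = {
    "planner.bin": b"planner v1.1.0",
    "controller.bin": b"controller v2.3.1 (hot-fixed on site)",
    "debug_tool.bin": b"left behind after testing",
}
for finding in audit(manifest, deployed):
    print(finding)
```

An empty findings list is evidence that the deployed product matches its baseline. Anything else points to an undocumented change that should go back through change control.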
Configuration Review and Verification This optional step closes the CM loop. It assesses whether CM processes are effective and aligned with project objectives. Activities include:
Tools:
Goal: Support continuous improvement and process optimisation.
Modern CM relies heavily on automation and integration tools to manage complexity and enforce discipline across teams. These tools can be categorized by function.
Version Control Systems (VCS)
| Tool | Description | Example Use |
|---|---|---|
| Git | Distributed version control system; supports branching and merging. | Used for nearly all modern software projects. |
| Subversion (SVN) | Centralised version control with strict change policies. | Still found in some regulated environments (aerospace, defence). |
| Mercurial | Similar to Git, optimised for scalability and ease of use. | Used in research or large repositories. |
Build and Continuous Integration Tools
| Tool | Purpose | Example Use |
|---|---|---|
| Jenkins / GitLab CI | Automate building, testing, and deploying changes. | Trigger builds after commits or merge requests. |
| Maven / Gradle / CMake | Manage project dependencies and build processes. | Ensure reproducible builds. |
| Docker / Podman | Containerise environments for consistency. | Package applications with dependencies for testing and deployment. |
Infrastructure and Environment Management
| Tool | Function | Application |
|---|---|---|
| Ansible / Puppet / Chef | Automate configuration and provisioning. | Keep server environments synchronised. |
| Terraform | Infrastructure as Code (IaC) for cloud platforms. | Manage cloud resources with version control. |
| Helm (Kubernetes) | Packages and manages container-based deployments. | Controls configurations in microservice architectures. |
Artifact and Release Management
| Tool | Purpose | Example Use |
|---|---|---|
| JFrog Artifactory / Nexus Repository | Store and version compiled binaries, libraries, and Docker images. | Maintain reproducibility of releases. |
| Spinnaker / Argo CD | Manage continuous deployment to production environments. | Implement automated rollouts and rollbacks. |
Configuration Tracking and Documentation
| Tool | Purpose | Use Case |
|---|---|---|
| ServiceNow CMDB | Tracks configuration items, dependencies, and incidents. | Enterprise-scale CM. |
| Atlassian Confluence | Maintains documentation and process records. | Collaboration and change documentation. |
| Polarion / IBM DOORS | Links requirements to configuration items and test results. | Traceability in regulated environments. |
Example – An integrated CM Workflow:
Toolchain Integration for Autonomous Systems In autonomous platforms (e.g., UAVs, vehicles), CM tools are often integrated with:
This hybrid approach ensures consistent software across all nodes — from cloud services to embedded controllers 23).
Even mature organisations often encounter challenges in lifecycle and configuration management:
| Pitfall | Effect | Mitigation |
|---|---|---|
| Poor version control discipline | Loss of traceability | Enforce the branching strategy and pull request reviews. |
| Incomplete configuration audits | Undetected inconsistencies | Automate audit workflows and compliance scanning. |
| Manual deployment processes | Environment drift | Use CI/CD and Infrastructure as Code. |
| Siloed documentation | Lack of visibility | Centralise records using CMDB or ALM platforms. |
| Lack of cultural adoption | Resistance to process discipline | Provide training, incentives, and leadership support. |
Organisations that succeed in embedding CM practices view them not as bureaucracy, but as enablers of reliability and trust.
A typical autonomy software stack is organised into hierarchical layers, each responsible for a specific subset of functions — from low-level sensor control to high-level decision-making and fleet coordination. Although implementations differ across domains (ground, aerial, marine), the core architectural logic remains similar:
This layered design aligns closely with both robotics frameworks (ROS 2) and automotive architectures (AUTOSAR Adaptive).
In Figure 1, the main software layers and their functions are depicted.
Hardware Abstraction Layer (HAL) The HAL provides standardised access to hardware resources. It translates hardware-specific details (e.g., sensor communication protocols, voltage levels) into software-accessible APIs. This functionality typically includes:
HAL ensures portability — software modules remain agnostic to specific hardware vendors or configurations 26).
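The portability argument can be made concrete with a small abstraction sketch. The interface `RangeSensor`, the two vendor classes, and their raw data formats (millimetres and echo time) are invented for illustration and do not correspond to any real driver.

```python
from abc import ABC, abstractmethod


class RangeSensor(ABC):
    """Hardware-independent interface seen by the upper layers."""

    @abstractmethod
    def read_metres(self):
        """Return the measured distance in metres."""


class VendorALidar(RangeSensor):
    """Hypothetical device that reports distance in millimetres."""

    def __init__(self, raw_mm):
        self._raw_mm = raw_mm

    def read_metres(self):
        return self._raw_mm / 1000.0


class VendorBSonar(RangeSensor):
    """Hypothetical device that reports round-trip echo time in microseconds."""

    SPEED_OF_SOUND_M_S = 343.0

    def __init__(self, echo_us):
        self._echo_us = echo_us

    def read_metres(self):
        # Halve the round-trip distance to get the one-way range.
        return self._echo_us * 1e-6 * self.SPEED_OF_SOUND_M_S / 2.0


def obstacle_too_close(sensor, threshold_m):
    """Upper-layer logic: works with any RangeSensor implementation."""
    return sensor.read_metres() < threshold_m


for sensor in (VendorALidar(raw_mm=400), VendorBSonar(echo_us=10000)):
    print(type(sensor).__name__, obstacle_too_close(sensor, threshold_m=0.5))
```

Swapping the sensor vendor changes only which subclass is instantiated. `obstacle_too_close` and everything above it stay untouched.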
Operating System (OS) and Virtualisation Layer The OS layer manages hardware resources, process scheduling, and interprocess communication (IPC). It also supports real-time operation and raises alerts and triggers through watchdog processes. Parallelising data processing is one of the keys to reserving resources for time-critical applications. Autonomous systems often use:
Time-Sensitive Networking (TSN) extensions and PREEMPT_RT patches help provide deterministic communication and scheduling for mission-critical tasks 27).
Middleware / Communication Layer The middleware layer serves as the data backbone of the autonomy stack. It manages communication between distributed software modules, ensuring real-time, reliable, and scalable data flow. In some of the architectures mentioned, middleware is the central distinguishing feature. Popular middleware technologies:
Control & Execution Layer The control layer translates planned trajectories into actuator commands while maintaining vehicle stability. It closes the feedback loop between command and sensor response. Key modules:
Safety-critical systems often employ redundant controllers and monitor nodes to prevent hazardous conditions 28).
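A redundant-controller arrangement with a monitor can be sketched as follows. The proportional control law, the limit and tolerance values, and the names `p_controller` and `MonitoredController` are assumptions for illustration. Production systems use far richer control laws and safety concepts.

```python
def p_controller(gain):
    """Return a simple proportional control law: command = gain * error."""
    return lambda error: gain * error


class MonitoredController:
    """Runs two redundant controllers and supervises their output."""

    def __init__(self, primary, secondary, limit, tolerance, safe_command=0.0):
        self.primary = primary
        self.secondary = secondary
        self.limit = limit              # actuator command limit (absolute value)
        self.tolerance = tolerance      # allowed disagreement between channels
        self.safe_command = safe_command

    def command(self, error):
        a = self.primary(error)
        b = self.secondary(error)
        if abs(a - b) > self.tolerance:
            # Channels disagree: do not trust either, fall back to safe output.
            return self.safe_command, "DISAGREE"
        clamped = max(-self.limit, min(self.limit, a))
        return clamped, ("OK" if clamped == a else "SATURATED")


healthy = MonitoredController(p_controller(2.0), p_controller(2.0),
                              limit=1.0, tolerance=0.05)
print(healthy.command(0.25))   # within limits
print(healthy.command(3.0))    # clamped to the actuator limit

faulty = MonitoredController(p_controller(2.0), p_controller(5.0),
                             limit=1.0, tolerance=0.05)
print(faulty.command(0.25))    # channels disagree, safe command issued
```

The monitor never tries to decide which channel is right. On disagreement it simply issues the predefined safe command and reports the condition.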
Autonomy Intelligence Layer This is the core of decision-making in the stack. It consists of several interrelated subsystems:
| Subsystem | Function | Example Techniques / Tools |
|---|---|---|
| Perception | Detect and classify objects, lanes, terrain, or obstacles. | CNNs, LiDAR segmentation, sensor fusion. |
| Localization | Estimate position relative to a global or local map. | SLAM, GNSS, Visual Odometry, EKF. |
| Planning | Compute feasible, safe paths or behaviours. | A*, D*, RRT*, Behavior Trees. |
| Prediction | Forecast the behaviour of the environment and, usually, the system’s own internal dynamics. | Recurrent Neural Networks, Bayesian inference. |
| Decision-making | Choose actions based on mission goals and context. | Finite State Machines, Reinforcement Learning. |
These components interact through middleware and run either on edge computers (onboard) or cloud-assisted systems for extended processing 29).
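As one concrete instance of the planning techniques listed in the table, the sketch below implements A* on a small occupancy grid. The 4-connected grid, unit step cost, and Manhattan-distance heuristic are simplifying assumptions. Real planners work on richer state spaces and cost functions.

```python
import heapq


def astar(grid, start, goal):
    """A* search on a 4-connected grid; cells with value 0 are free, 1 blocked.

    Returns the path as a list of (row, col) cells, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    came_from = {}
    best_cost = {start: 0}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(astar(grid, start=(0, 0), goal=(3, 0)))
```

In a full stack, the grid would come from perception and localisation, and the resulting path would be handed to the control layer as a reference trajectory.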
Application & Cloud Layer At the top of the stack lies the application layer, which extends autonomy beyond individual vehicles:
Frameworks like AWS RoboMaker, NVIDIA DRIVE Sim, and Microsoft AirSim bridge onboard autonomy with cloud computation.
Autonomy systems rely on data pipelines that move information between layers in real time.
Each stage includes feedback loops to ensure error correction and safety monitoring 30) 31).
ROS 2-Based Stack (Research and Prototyping)
AUTOSAR Adaptive Platform (Automotive)
MOOS-IvP (Marine Autonomy)
Hybrid Cloud-Edge Architectures
This closed-loop data exchange ensures real-time responsiveness, robust error recovery, and cross-module coherence.
Developing and maintaining an autonomous software stack is a long-term, multidisciplinary endeavour. Unlike conventional software, autonomy stacks must handle:
These constraints make the software lifecycle for autonomy uniquely complex — spanning from initial research prototypes to industrial-grade, certified systems.
Even with a good understanding of autonomous software stacks, their development still involves significant and challenging problems. Mitigating these problems and applying the corresponding solutions makes autonomous systems expensive to design and develop and hard to maintain. The most significant challenges are the following.
Real-Time Performance and Determinism Autonomous systems require deterministic behaviour: decisions must be made within fixed, guaranteed time frames. However, high computational demands from AI algorithms often conflict with real-time guarantees 34). Key Issues:
Timing mismatches across sensor and control loops.
Mitigation:
Scalability and Software Complexity As systems evolve, the number of nodes, processes, and data streams grows rapidly. For instance, a modern L4 autonomous vehicle may contain more than 200 software nodes exchanging gigabytes of data per second. Problems:
Solutions:
Integration of AI and Classical Control AI-based perception and classical control must coexist smoothly. While AI modules (e.g., neural networks) handle high-dimensional perception, classical modules (e.g., PID, MPC) ensure predictable control. Challenge:
Best Practices:
Safety, Verification, and Certification Autonomous systems must conform to standards such as the previously mentioned ISO 26262 (automotive functional safety), DO-178C (aerospace software certification), and IEC 61508 (industrial safety). Challenges:
Emerging Solutions:
Cybersecurity and Software Integrity Autonomous platforms are connected via V2X, cloud APIs, and OTA updates — creating multiple attack surfaces 37). Risks:
Countermeasures:
Continuous Maintenance and Updates Unlike static embedded systems, autonomy software evolves continuously. Developers must maintain compatibility across versions, hardware platforms, and fleets already deployed in the field. Maintenance Practices:
Data Management and Scalability AI-driven autonomy relies on vast datasets for training, simulation, and validation. Managing, labelling, and securing this data is an ongoing challenge 40). Issues:
Approaches:
Human–Machine Collaboration and Ethical Oversight Autonomy software doesn’t exist in isolation — it interacts with human operators, passengers, and society. Thus, software design must incorporate transparency, accountability, and explainability. Key Considerations:
The software lifecycle typically follows a continuous evolution model:
| Phase | Purpose | Typical Tools |
|---|---|---|
| Design and Simulation | Define architecture, run models, and simulate missions. | MATLAB/Simulink, Gazebo, CARLA, AirSim. |
| Implementation and Integration | Develop and combine software modules. | ROS 2, AUTOSAR, GitLab CI, Docker. |
| Testing and Validation | Perform SIL/HIL and system-level tests. | Jenkins, Digital Twins, ISO safety audits. |
| Deployment | Distribute to field systems with OTA updates. | Kubernetes, AWS Greengrass, Edge IoT. |
| Monitoring and Maintenance | Collect telemetry and update models. | Prometheus, Grafana, ROS diagnostics. |
The goal is continuous evolution with stability, where systems can adapt without losing certification or reliability.
Both the automotive and airborne sectors have reacted to AI by treating it as “specialized software” in standards such as ISO/PAS 8800 [14] and [13]. This approach has the great utility of leveraging all the past work on generic mechanical safety and on software validation. However, one must now handle the fact that the “code” is generated from data rather than written as conventional program code. In V&V, this difference manifests in three significant aspects: coverage analysis, code reviews, and version control.
| V&V Technique | Software | AI/ML |
|---|---|---|
| Coverage Analysis | Code Structure provides basis of coverage | No structure |
| Code Reviews | Crowd source expert knowledge | No Code to Review |
| Version Control | Careful construction/release | Very Difficult with data |
These differences create an enormous problem for intelligent test generation and for any argument of completeness. This is an area of active research, and two threads have emerged:
1) Training Set Validation: Since the final referenced component is very hard to analyze, one approach is to examine the training set and the operational design domain (ODD) to find interesting tests that may expose the gaps between them [16].
2) Robustness to Noise: Either through simulation or using formal methods [17], the approach is to assert various higher-level properties and use these to test the component. An example in object recognition might be to assert the property that an object should be recognized independent of orientation.
Overall, developing robust methods for AI component validation remains an active and unsolved research topic even for “fixed”-function AI components, that is, AI components whose function changes only under active version control. Of course, many AI applications prefer a model where the AI component is constantly morphing. Validating the morphing situation is a topic of future research.
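The orientation property mentioned above can be turned into an executable check, in the spirit of property-based (metamorphic) testing. The sketch below uses two deliberately trivial stand-in classifiers on tiny binary images. `toy_classifier`, `fragile_classifier`, and `check_rotation_invariance` are hypothetical names, and only 90-degree rotations are tested, which is a simplification.

```python
def rotate90(image):
    """Rotate a 2-D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def toy_classifier(image):
    """Stand-in model: counts bright pixels, so rotation cannot change it."""
    bright = sum(sum(row) for row in image)
    return "object" if bright >= 3 else "background"


def fragile_classifier(image):
    """Stand-in model with a flaw: it only looks at the top row."""
    return "object" if sum(image[0]) >= 2 else "background"


def check_rotation_invariance(classifier, images):
    """Return (image_index, angle) pairs where the label changed under rotation."""
    counterexamples = []
    for index, image in enumerate(images):
        reference = classifier(image)
        rotated = image
        for angle in (90, 180, 270):
            rotated = rotate90(rotated)
            if classifier(rotated) != reference:
                counterexamples.append((index, angle))
    return counterexamples


images = [[[1, 1, 0],
           [0, 1, 0],
           [0, 0, 0]]]
print(check_rotation_invariance(toy_classifier, images))      # no violations
print(check_rotation_invariance(fragile_classifier, images))  # violations found
```

The check needs no ground-truth label. It only requires that the prediction stays the same under a transformation that should not matter, which is why this style of test suits components without a conventional specification.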
For well-defined systems where system-level abstractions are available, AI/ML components significantly increase the difficulty of intelligent test generation. With a golden spec, one can follow a structured process to make significant progress in validation and even gate the AI results with conventional safeguards. Unfortunately, one of the most compelling uses of AI is to employ it in situations where the specification of the system is not well defined or not viable using conventional programming. In these Specification-Less ML (SLML) situations, not only is building interesting tests difficult, but evaluating the correctness of the results creates further difficulty. Further, most of the major systems (perception, location services, path planning, etc.) in autonomous vehicles fall into this category of system function and AI usage. To date, there have been two approaches to attack the lack-of-specification problem: Anti-Spec and AI-Driver.
1) Anti-Spec: In these situations, the only approach left is to specify correctness through an anti-spec. The simplest anti-spec is to avoid accidents. Based on some initial work by Intel, there is a standard, IEEE 2846, “Assumptions for Models in Safety-Related Automated Vehicle Behavior” [18], which establishes a framework for defining a minimum set of assumptions regarding the reasonably foreseeable behaviors of other road users. For each scenario, it specifies assumptions about the kinematic properties of other road users, including their speed, acceleration, and possible maneuvers. Challenges include an argument for completeness, a specification for the machinery for checking against the standard, and the connection to a liability governance framework.
2) AI-Driver: While IEEE 2846 comes from a bottom-up technology perspective, Koopman/Widen [19] have proposed the concept of defining an AI driver which must replicate all the competencies of a human driver in a complex, real-world environment.
Key points of Koopman’s AI driver concept include:
a) Full Driving Capability: The AI driver must handle the entire driving task, including perception (sensing the environment), decision-making (planning and responding to scenarios), and control (executing physical movements like steering and braking). It must also account for nuances like social driving norms and unexpected events.
b) Safety Assurance: Koopman stresses that AVs need rigorous safety standards, similar to those in industries like aviation. This includes identifying potential failures, managing risks, and ensuring safe operation even in the face of unforeseen events.
c) Human Equivalence: The AI driver must meet or exceed the performance of a competent human driver. This involves adhering to traffic laws, responding to edge cases (rare or unusual driving scenarios), and maintaining situational awareness at all times.
d) Ethical and Legal Responsibility: An AI driver must operate within ethical and legal frameworks, including handling situations that involve moral decisions or liability concerns.
e) Testing and Validation: Koopman emphasizes the importance of robust testing, simulation, and on-road trials to validate AI driver systems. This includes covering edge cases and long-tail risks, and ensuring that systems generalize across diverse driving conditions.
Overall, it is a very ambitious endeavor, and there are significant challenges to building this specification of a reasonable driver. First, the idea of a “reasonable” driver is not even well encoded on the human side. Rather, this definition of “reasonableness” is built over a long history of legal distillation, and of course, the human standard is built on the understanding of humans by other humans. Second, the complexity of such a standard would be very high, and it is not clear that it is achievable.
Finally, it may take quite a while of legal distillation to reach some level of closure on a human-like “AI-Driver.” Currently, the state of the art for specification is relatively poor for both ADAS and AV. ADAS systems, which are widely deployed, show massive divergences in behavior and completeness. When a customer buys ADAS, it is not entirely clear what they are getting. Tests by groups such as AAA, Consumer Reports, and IIHS have shown significant shortcomings in existing solutions [20]. In 2024, IIHS introduced a ratings program to evaluate the safeguards of partial driving automation systems. Out of 14 systems tested, only one received an acceptable rating, highlighting the need for improved measures to prevent misuse and ensure driver engagement [21]. Today, there is only one non-process-oriented regulation in the marketplace: the NHTSA regulation on AEB [22].
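To make the anti-spec idea more concrete, kinematic assumptions about other road users (speed, acceleration, braking) can be turned into a checkable rule. The sketch below uses the longitudinal safe-distance expression commonly associated with the Responsibility-Sensitive Safety (RSS) model. It is an illustration only: the function name and all parameter values are assumptions, and IEEE 2846 itself does not prescribe this formula or these numbers.

```python
def min_safe_gap(v_rear: float, v_front: float, response_time: float,
                 a_max_accel: float, b_min_brake: float, b_max_brake: float) -> float:
    """Minimum following gap (m) so the rear vehicle can always stop in time.

    Worst case assumed: the rear vehicle accelerates at a_max_accel during its
    response time and then brakes at only b_min_brake, while the front vehicle
    brakes as hard as b_max_brake. Speeds in m/s, accelerations in m/s^2.
    """
    v_rear_after = v_rear + response_time * a_max_accel
    gap = (v_rear * response_time
           + 0.5 * a_max_accel * response_time ** 2
           + v_rear_after ** 2 / (2.0 * b_min_brake)
           - v_front ** 2 / (2.0 * b_max_brake))
    return max(0.0, gap)

# Illustrative values only (not taken from any standard).
print(min_safe_gap(v_rear=20.0, v_front=20.0, response_time=1.0,
                   a_max_accel=2.0, b_min_brake=4.0, b_max_brake=8.0))
```

A rule of this kind says nothing about what the vehicle should do; it only flags states that violate the assumed bounds, which is what makes it an anti-spec rather than a specification.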
This chapter traces the evolution of software from programmable hardware foundations to a dominant force in modern computing systems. Early advances in hardware programmability—through configuration, programmable logic (e.g., FPGAs), and stored-program processors—enabled a separation between physical implementation and functional behavior. The introduction of stable computer architectures (notably IBM System/360) and operating systems created enduring abstractions that allowed software portability, scalability, and rapid innovation. Over time, networking and open-source ecosystems further accelerated the growth of information technology, establishing software as the central driver of capability across computing platforms.
As software methods entered cyber-physical systems (CPS)—including ground, airborne, marine, and space domains—they followed a distinct trajectory shaped by real-time constraints, safety requirements, and physical interaction. Initially introduced to enhance control and diagnostics, software evolved into the core coordinating layer for sensing, decision-making, and actuation, enabling autonomy. This transition was supported by the emergence of real-time operating systems (RTOSes), middleware, and layered software architectures that ensured deterministic behavior and modularity. Across all domains, systems evolved from isolated, hardware-centric designs to distributed, software-intensive platforms, with increasing reliance on standardized frameworks and communication protocols.
The chapter further highlights how software has transformed product development, supply chains, and validation practices. Cyber-physical systems are increasingly influenced by the faster-moving IT ecosystem, adopting open-source components, layered stacks, and continuous update models (e.g., software-defined vehicles). At the same time, safety standards (e.g., ISO 26262, DO-178C) and rigorous verification methods—such as hardware/software co-simulation (MIL, SIL, HIL)—have evolved to address the risks of software-driven behavior. Modern software supply chains are complex, incorporating third-party and open-source dependencies, requiring strong configuration management, traceability, and cybersecurity practices. Overall, the chapter emphasizes a fundamental shift: engineered systems are no longer hardware products with embedded software, but increasingly software platforms embodied in hardware.
| Stack Framework | Type | Core Covered Layers | Key Technologies | Domain Focus | Notes / Differentiation |
|---|---|---|---|---|---|
| ROS 2 | Open-source middleware stack | Middleware, application | DDS, nodes, topics, Gazebo, RViz | Robotics, AV | De facto R&D standard; highly modular |
| AUTOSAR Adaptive | Automotive software platform | OS, middleware, apps | POSIX OS, SOME/IP, service-oriented | Automotive (ADAS/AV) | Designed for ISO 26262 + OTA updates |
| AUTOSAR Classic Platform | Embedded real-time stack | HAL, RTOS, basic software | OSEK or RTOS, CAN, ECU abstraction | Automotive ECUs | Deterministic, safety-certified |
| Apollo | Full autonomy stack | Full stack (perception → control) | Cyber RT, AI models, HD maps | Autonomous driving (L2–L4) | One of the most complete open AV stacks |
| Autoware | Open AV stack | Full autonomy pipeline | ROS 2, perception, planning modules | Automotive, robotics | Strong academic + industry ecosystem |
| NVIDIA DRIVE OS | Integrated platform | OS, middleware, AI runtime | CUDA, TensorRT, DriveWorks | Automotive autonomy | Tight HW/SW co-design with GPUs |
| QNX Neutrino | RTOS middleware | OS, safety layer | POSIX RTOS, microkernel | Automotive, industrial | Strong certification (ASIL-D) |
| VxWorks | RTOS | OS, middleware | Deterministic RTOS, ARINC653 | Aerospace, defense | Widely used in safety-critical systems |
| PX4 Autopilot | UAV autonomy stack | Control, middleware, perception | MAVLink, EKF, control loops | UAV / drones | Industry standard for drones |
| ArduPilot | UAV autonomy stack | Control + navigation | Mission planning, sensor fusion | UAV, marine robotics | Broad vehicle support (air/land/sea) |
| MOOS-IvP | Marine autonomy stack | Middleware | Behavior-based robotics | Marine robotics | Optimized for low bandwidth environments |
| DDS (Data Distribution Service) | Middleware standard | Communication layer | QoS messaging, pub-sub | Cross-domain CPS | Backbone of ROS 2 and many systems |
| AWS RoboMaker | Cloud robotics stack | Cloud, simulation | DevOps, ROS integration | Robotics, AV | Enables CI/CD + simulation workflows |
| Microsoft AirSim | Simulation stack | Simulation layer | Unreal Engine, physics models | UAV, AV | High-fidelity perception simulation |
| CARLA | Simulation stack | Simulation layer | OpenDRIVE, sensors, physics | Automotive | Widely used for AV validation |
| Gazebo | Simulation stack | Simulation integration | Physics engine, ROS integration | Robotics | Standard for ROS-based systems |
The modern era of autonomy is often traced to the DARPA Grand Challenges (2004–2007), but it builds on decades of earlier automation across ground, marine, airborne, and space systems. In the airborne domain, autopilots date back to early 20th-century systems like Sperry Autopilot, evolving into today’s highly integrated flight management systems used on commercial aircraft such as the Boeing 777 and Airbus A320, where autopilot, autothrottle, and fly-by-wire systems routinely manage most phases of flight under human supervision. In the marine domain, ships have long used autopilots and dynamic positioning systems, while space systems—from the Apollo Guidance Computer to modern autonomous navigation on Mars rovers—demonstrated early closed-loop autonomy under extreme constraints. Ground systems, by contrast, lagged due to environmental complexity, which is why the DARPA challenges were so pivotal: the 2004 desert race exposed the immaturity of perception and planning, but by 2005 Stanford’s “Stanley” completed a 132-mile autonomous route, and the 2007 Urban Challenge introduced interaction with traffic, rules, and other agents. These competitions unified advances in sensing, probabilistic reasoning, and real-time control into full-stack autonomous systems and created the talent base that later drove commercial autonomy.
Prior to the DARPA challenges, deterministic algorithms were not able to make progress on important aspects of building autonomous systems such as object recognition, path planning, or localization. The big recent leap in technology was the use of artificial intelligence to attack these previously intractable problems. The introduction of AI moved the field forward significantly, but it also introduced challenges.
This chapter introduces perception, mapping, and localization in the context of autonomous vehicles, along with the use of different sensor modalities. It examines the determination of vehicle position, the position and activities of other traffic participants, understanding of the surrounding scene, scene mapping and map maintenance for navigation, applications of AI, and possible sources of uncertainty and instability.
Object detection is the fundamental perception function that allows an autonomous vehicle to identify and localize relevant entities in its surroundings. It converts raw sensor inputs into structured semantic and geometric information, forming the basis for higher-level tasks such as tracking, prediction, and planning. By maintaining awareness of all objects within its operational environment, the vehicle can make safe and contextually appropriate decisions.
Detected objects may include:
Each detection typically includes a semantic label, a spatial bounding box (2D or 3D), a confidence score, and sometimes velocity or orientation information. Accurate detection underpins all subsequent stages of autonomous behavior; any missed or false detection may lead to unsafe or inefficient decisions downstream.
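A detection record of this kind can be sketched as a small data structure. The field names below are illustrative assumptions, not a format defined in this chapter. The intersection-over-union (IoU) helper is a widely used way to compare a detected box against a reference box when counting missed and false detections.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box2D = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max (pixels)

@dataclass
class Detection:
    label: str                                      # semantic class, e.g. "pedestrian"
    box: Box2D                                      # 2D bounding box
    confidence: float                               # score in [0, 1]
    velocity: Optional[Tuple[float, float]] = None  # optional (vx, vy) in m/s

def iou(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

det = Detection("pedestrian", (0.0, 0.0, 2.0, 2.0), 0.91)
truth = (1.0, 1.0, 3.0, 3.0)
print(det.label, round(iou(det.box, truth), 3))
```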
Object detection relies on a combination of complementary sensors, each contributing distinct types of information and requiring specialized algorithms.
Cameras provide dense visual data with rich color and texture, essential for semantic understanding. Typical camera-based detection methods include:
Histogram of Oriented Gradients (HOG), Scale-Invariant Feature Transform (SIFT), and Speeded-Up Robust Features (SURF), used in early lane and pedestrian detection systems.
Support Vector Machines (SVM) or AdaBoost combined with handcrafted features for real-time pedestrian detection.
Cameras are indispensable for interpreting traffic lights, signs, lane markings, and human gestures, but their performance can degrade under low illumination, glare, or adverse weather conditions.
LiDAR (Light Detection and Ranging) measures distances by timing laser pulse returns, producing dense 3D point clouds. LiDAR-based object detection methods focus on geometric reasoning:
Euclidean Cluster Extraction and Region Growing, which group nearby points into potential objects.
RANSAC for detecting planes, poles, or cylindrical objects.
LiDAR’s precise geometry enables accurate distance and shape estimation, but sparse returns or partial occlusions can challenge classification performance.
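The idea behind Euclidean cluster extraction can be illustrated with a brute-force region-growing sketch: start from a seed point and keep absorbing any unvisited point within a distance tolerance. This is a simplified illustration under stated assumptions. Production implementations typically use a spatial index such as a k-d tree instead of the all-pairs search shown here, and the tolerance value is arbitrary.

```python
from collections import deque
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def euclidean_clusters(points: Sequence[Point], tolerance: float,
                       min_size: int = 1) -> List[List[int]]:
    """Group point indices so that each cluster is connected by hops <= tolerance."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            # Brute-force neighbor search among points not yet assigned.
            near = [j for j in unvisited if dist(points[i], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return sorted(clusters)

cloud = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.0, 0.0),
         (5.0, 5.0, 0.0), (5.2, 5.0, 0.0)]
print(euclidean_clusters(cloud, tolerance=0.5))
```

Note that the first and third points are farther apart than the tolerance yet end up in the same cluster, because the second point links them. This chaining is what lets the method follow the surface of an extended object.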
Radar (Radio Detection and Ranging) provides long-range distance and velocity information using radio waves. Its unique Doppler measurements are invaluable for tracking motion, even in fog, dust, or darkness. Typical radar-based detection techniques include:
Radar systems are especially important for early hazard detection and collision avoidance, as they function effectively through adverse weather and poor visibility.
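The Doppler measurement mentioned above follows from a simple relation: for a monostatic radar, the frequency shift is proportional to the target's radial velocity and to the carrier frequency. The sketch below shows that relation in both directions. The 77 GHz carrier is an assumed example value typical of automotive radar, and the function names are illustrative.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_shift(v_radial: float, carrier_hz: float) -> float:
    """Doppler shift (Hz) for a target closing at v_radial m/s (two-way path)."""
    return 2.0 * v_radial * carrier_hz / C

def radial_velocity(doppler_hz: float, carrier_hz: float) -> float:
    """Radial velocity (m/s) recovered from a measured Doppler shift."""
    return doppler_hz * C / (2.0 * carrier_hz)

carrier = 77e9  # assumed automotive radar carrier frequency
shift = doppler_shift(10.0, carrier)
print(round(shift, 1), "Hz ->", round(radial_velocity(shift, carrier), 3), "m/s")
```

Only the velocity component along the line of sight is observed this way, so a target crossing the beam at right angles produces little or no Doppler shift.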
Ultrasonic and sonar sensors detect objects through acoustic wave reflections and are particularly useful in environments where optical or electromagnetic sensing is limited. They are integral not only to ground vehicles for close-range detection but also to surface and underwater autonomous vehicles for navigation, obstacle avoidance, and terrain mapping.
For ground vehicles, ultrasonic sensors operate at short ranges (typically below 5 meters) and are used for parking assistance, blind-spot detection, and proximity monitoring. Common methods include:
For surface and underwater autonomous vehicles, sonar systems extend these principles over much longer ranges and through acoustically dense media. Typical sonar-based detection methods include:
These acoustic systems are essential in domains where electromagnetic sensing (e.g., camera, LiDAR, radar) is unreliable — such as murky water, turbid environments, or beneath the ocean surface. Although sonar has lower spatial resolution than optical systems and is affected by multipath and scattering effects, it offers unmatched robustness in low-visibility conditions. As with other sensors, regular calibration, signal filtering, and environmental adaptation are necessary to maintain detection accuracy across varying salinity, temperature, and depth profiles.
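The echo-ranging principle shared by ultrasonic and sonar sensors, and the reason environmental adaptation matters, can be shown in a few lines. This is a rough sketch: the linear formula for the speed of sound in air is a common approximation for dry air, and the 1500 m/s value for seawater is a nominal assumption that in practice varies with salinity, temperature, and depth.

```python
def speed_of_sound_air(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def echo_range(round_trip_s: float, sound_speed: float) -> float:
    """One-way distance (m) from a round-trip echo time (s)."""
    return sound_speed * round_trip_s / 2.0

# Parking sensor: the same 10 ms echo read at two air temperatures.
for temp in (-10.0, 30.0):
    print(temp, "C:", round(echo_range(0.010, speed_of_sound_air(temp)), 3), "m")

# Sonar: nominal seawater sound speed (assumption).
print(echo_range(0.2, 1500.0), "m")
```

The same echo time maps to a range that differs by several percent between a cold and a hot day, which is why the sound speed has to be compensated rather than hard-coded.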
Object detection outputs can be represented in different coordinate systems and abstraction levels:
Hybrid systems combine these paradigms—for example, camera-based semantic labeling enhanced with LiDAR-derived 3D geometry—to achieve both contextual awareness and metric accuracy.
A standard object detection pipeline in an autonomous vehicle proceeds through the following stages:
The pipeline operates continuously in real time (typically 10–30 Hz) with deterministic latency to meet safety and control requirements.
No single sensor technology can capture all aspects of a complex driving scene across the full diversity of weather, lighting, and traffic conditions. Therefore, data from multiple sensors is fused (combined) to obtain a more complete, accurate, and reliable understanding of the environment than any single sensor could provide alone.
Each sensor modality has distinct advantages and weaknesses:
By fusing these complementary data sources, the perception system can achieve redundancy, increased accuracy, and fault tolerance — key factors for functional safety (ISO 26262).
Sensor fusion can serve two goals: complementarity, where different sensors contribute unique, non-overlapping information, and redundancy, where overlapping sensors confirm each other’s measurements and improve reliability. When multiple sensor modalities are used, both goals can be achieved.
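The benefit of redundancy can be quantified in the simplest case of two sensors measuring the same scalar quantity. The sketch below assumes independent Gaussian errors with known variances and combines the two readings by inverse-variance weighting, which is the scalar form of a Kalman measurement update. The sensor values and variances are made-up examples.

```python
from typing import Tuple

def fuse(mean_a: float, var_a: float, mean_b: float, var_b: float) -> Tuple[float, float]:
    """Fuse two independent Gaussian estimates of the same quantity."""
    gain = var_a / (var_a + var_b)          # how much to trust measurement b
    mean = mean_a + gain * (mean_b - mean_a)
    var = (1.0 - gain) * var_a              # always <= min(var_a, var_b)
    return mean, var

# Hypothetical range to the same object: radar (low variance), camera (higher).
radar = (10.0, 1.0)
camera = (12.0, 3.0)
print(fuse(*radar, *camera))
```

The fused variance is smaller than either input variance, which is the formal sense in which overlapping sensors improve reliability. The fused mean sits closer to the more trusted sensor.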
Accurate fusion depends critically on spatial and temporal alignment among sensors.
Calibration errors lead to spatial inconsistencies that can degrade detection accuracy or cause false positives. Therefore, calibration is treated as part of the functional safety chain and is regularly verified in maintenance and validation routines.
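The effect of a calibration error on spatial alignment can be illustrated by projecting a LiDAR point into a camera image. This is a deliberately simplified sketch. It assumes both sensors use the camera axis convention (x right, y down, z forward), models the extrinsic calibration as a yaw rotation plus a translation, and uses a pinhole camera with made-up intrinsics.

```python
from math import cos, sin, radians
from typing import Tuple

def project_to_pixel(point: Tuple[float, float, float], yaw_deg: float,
                     translation: Tuple[float, float, float],
                     fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Transform a 3D point into the camera frame and project it to pixels."""
    x, y, z = point
    yaw = radians(yaw_deg)
    # Extrinsics: rotation about the vertical (y) axis, then translation.
    xc = cos(yaw) * x + sin(yaw) * z + translation[0]
    yc = y + translation[1]
    zc = -sin(yaw) * x + cos(yaw) * z + translation[2]
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Pinhole projection with focal lengths (fx, fy) and principal point (cx, cy).
    return fx * xc / zc + cx, fy * yc / zc + cy

intrinsics = dict(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)  # assumed values
target = (0.0, 0.0, 10.0)                                    # 10 m straight ahead
u_true, _ = project_to_pixel(target, 0.0, (0.0, 0.0, 0.0), **intrinsics)
u_bad, _ = project_to_pixel(target, 1.0, (0.0, 0.0, 0.0), **intrinsics)
print(round(u_bad - u_true, 2), "pixel shift from a 1 degree yaw error")
```

With these assumed intrinsics, a one-degree yaw error moves the projected point by more than 17 pixels, enough to attach LiDAR depth to the wrong image object at longer ranges.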
Fusion can occur at different stages in the perception pipeline, commonly divided into three levels:
The mathematical basis of sensor fusion lies in probabilistic state estimation and Bayesian inference. Typical formulations represent the system state as a probability distribution updated by sensor measurements. Common techniques include:
Deep learning has significantly advanced sensor fusion. Neural architectures learn optimal fusion weights and correlations automatically, often outperforming hand-designed algorithms. For example:
End-to-end fusion networks can jointly optimize detection, segmentation, and motion estimation tasks, enhancing both accuracy and robustness. However, deep fusion models require large multimodal datasets for training and careful validation to ensure generalization and interpretability.
Advances in AI, especially the convolutional neural network, allow us to process raw sensory information and recognize objects and categorize them into classes with higher levels of abstraction (pedestrians, cars, trees, etc.). Taking these categories into account allows autonomous vehicles to understand the scene and reason about future actions of the vehicle as well as about the other participants in road traffic and make assumptions on/predictions of their possible interactions. This section elaborates on the comparison of commonly used methods, their advantages, and weaknesses.
Traditional perception pipelines used hand-crafted algorithms for feature extraction and rule-based classification (e.g., edge detection, optical flow, color segmentation). While effective for controlled conditions, these systems failed to generalize to the vast variability of real-world driving — lighting changes, weather conditions, sensor noise, and unexpected objects.
The advent of deep learning revolutionized perception by enabling systems to learn features automatically from large datasets rather than relying on manually designed rules. Deep neural networks, trained on millions of labeled examples, can capture complex, nonlinear relationships between raw sensor inputs and semantic concepts such as vehicles, pedestrians, and traffic lights.
In an autonomous vehicle, AI-based perception performs several core tasks:
Deep learning architectures form the computational backbone of AI-based perception systems in autonomous vehicles. They enable the extraction of complex spatial and temporal patterns directly from raw sensory data such as images, point clouds, and radar returns. Different neural network paradigms specialize in different types of data and tasks, yet modern perception stacks often combine several architectures into hybrid frameworks.
Convolutional Neural Networks are the most established class of models in computer vision. They process visual information through layers of convolutional filters that learn spatial hierarchies of features — from edges and corners to textures and object parts. CNNs are particularly effective for object detection, semantic segmentation, and image classification tasks. Prominent CNN-based architectures used in autonomous driving include:
ResNet and EfficientNet for general feature extraction,
Faster R-CNN and YOLO families for real-time object detection,
U-Net and DeepLab for dense semantic segmentation.
While cameras capture two-dimensional projections, LiDAR and radar sensors produce three-dimensional point clouds that require specialized processing.
3D convolutional networks, such as VoxelNet and SECOND, discretize space into voxels and apply convolutional filters to learn geometric features.
Alternatively, point-based networks like PointNet and PointNet++ operate directly on raw point sets without voxelization, preserving fine geometric detail.
These models are critical for estimating the shape and distance of objects in 3D space, especially under challenging lighting or weather conditions.
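The voxelization step used by voxel-based networks can be sketched without any deep learning machinery: each point is assigned to a cell of a regular 3D grid, and the network later operates on per-cell features. This is only the discretization step, not VoxelNet or SECOND themselves, and the 0.5 m voxel size is an arbitrary assumption.

```python
from collections import defaultdict
from math import floor
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]

def voxelize(points: Sequence[Point], voxel_size: float) -> Dict[VoxelKey, List[Point]]:
    """Group points by the integer index of the voxel that contains them."""
    grid: Dict[VoxelKey, List[Point]] = defaultdict(list)
    for p in points:
        key = tuple(floor(c / voxel_size) for c in p)
        grid[key].append(p)
    return dict(grid)

cloud = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.1, 0.1), (-0.1, 0.0, 0.0)]
for key, members in sorted(voxelize(cloud, voxel_size=0.5).items()):
    print(key, len(members))
```

The trade-off against point-based networks is visible here: points that share a voxel lose their individual positions unless extra per-point features are kept, while the regular grid makes convolution straightforward.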
Transformer networks, initially developed for natural language processing, have been adapted for vision and multimodal perception.
They rely on self-attention mechanisms, which allow the model to capture long-range dependencies and contextual relationships between different parts of an image or between multiple sensors.
In autonomous driving, transformers are used for feature fusion, bird’s-eye-view (BEV) mapping, and trajectory prediction.
Notable examples include DETR (Detection Transformer), BEVFormer, and TransFusion, which unify information from cameras and LiDARs into a consistent spatial representation.
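The self-attention mechanism can be shown in its barest form as scaled dot-product attention over a handful of feature vectors. The sketch below is a single head with identity projections, meaning queries, keys, and values are all the input tokens themselves. Real transformers use learned projection matrices, multiple heads, and positional encodings, none of which are modeled here.

```python
from math import exp, sqrt
from typing import List

Vector = List[float]

def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(xs)
    es = [exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention with Q = K = V = tokens."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / sqrt(d) for k in tokens]
        weights = softmax(scores)   # how strongly q attends to every token
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

# Two hypothetical feature vectors, e.g. one camera patch and one LiDAR pillar.
print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
```

Every output is a weighted mix of all inputs, so each token can draw on information from anywhere in the image or from another sensor, which is the property exploited for feature fusion.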
Driving is inherently a dynamic process, requiring understanding of motion and temporal evolution. Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models, are used to process sequences of observations and capture temporal dependencies. They are common in object tracking and motion prediction, where maintaining consistent identities and velocities of moving objects over time is essential. More recent architectures use temporal convolutional networks or transformers to achieve similar results with greater parallelism and stability.
Graph Neural Networks extend deep learning to relational data, representing scenes as graphs where nodes correspond to agents or landmarks and edges encode spatial or behavioral relationships.
This structure makes GNNs well suited for modeling interactions among vehicles, pedestrians, and infrastructure elements.
Models such as VectorNet, Trajectron++, and Scene Transformer use GNNs to learn dependencies between agents, supporting both scene understanding and trajectory forecasting.
Modern perception systems often combine multiple architectural families into unified frameworks. For instance, a CNN may extract image features, a point-based network may process LiDAR geometry, and a transformer may fuse both into a joint representation. These hierarchical and multimodal architectures enable robust perception across varied environments and sensor conditions, providing the high-level scene understanding required for safe autonomous behavior.
The effectiveness of AI-based perception systems depends fundamentally on the quality, diversity, and management of the data used throughout their development lifecycle.
Because deep neural networks are not explicitly programmed but instead learn to interpret the environment from large, annotated datasets, data becomes the foundation of reliable perception for autonomous vehicles.
Robust perception requires exposure to the full range of operating conditions that a vehicle may encounter. Datasets must include variations in:
A balanced dataset should capture both common and unusual situations to ensure that perception models generalize safely beyond the training distribution.
Because collecting real-world data for every possible scenario is impractical, simulated or synthetic data are often used to supplement real-world datasets.
Photorealistic simulators such as CARLA, LGSVL, or AirSim allow the generation of labeled sensor data under controlled conditions, including rare or hazardous events.
Synthetic data helps to fill gaps in real-world coverage and supports transfer learning, though domain adaptation is often required to mitigate the so-called sim-to-real gap — differences between simulated and actual sensor distributions.
Supervised learning models rely on accurately annotated datasets, where each image, frame, or point cloud is labeled with semantic information such as object classes, bounding boxes, or segmentation masks. Annotation quality is critical: inconsistent or noisy labels can propagate systematic errors through the learning process. Modern annotation pipelines combine human labeling with automation — using pre-trained models, interactive tools, and active learning to accelerate the process. High-precision labeling is particularly demanding for LiDAR point clouds and multi-sensor fusion datasets, where 3D geometric consistency must be maintained across frames.
Data used in autonomous driving frequently includes imagery of people, vehicles, and property. To comply with privacy regulations and ethical standards, datasets must be anonymized by blurring faces and license plates, encrypting location data, and maintaining secure data storage. Fairness and inclusivity in dataset design are equally important to prevent bias across geographic regions or demographic contexts.
Scene understanding is a process by which an autonomous agent interprets its environment as a coherent model — integrating environment map, objects, semantics, and dynamics into a structured representation that supports decision-making. It is the bridge between raw perception and higher-level autonomy functions such as planning, prediction, and control.
The goal of scene understanding is to transform fragmented sensor detections into a meaningful, temporally consistent model of the surrounding scene.
Scene understanding often relies on multi-layered representations:
The relational layer captures how entities within a traffic scene interact with one another and with the static environment. While lower layers (geometric and semantic) describe what exists and where it is, the relational layer describes how elements relate — spatially, functionally, and behaviorally.
Spatial relations describe, for example, mutual distance, relative velocity, and possible intersection of trajectories. Functional relations describe how one entity modifies, limits, or restricts the behavior of another: for example, traffic lanes constrain the movement of vehicles, and railings restrict the movement of pedestrians.
These relations can be explicitly represented by scene graphs, where nodes represent entities and edges represent relationships, or encoded in different types of neural networks, e.g., visual-language models.
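An explicit scene graph can be sketched as a set of entities plus labeled edges between them. The class, entity names, and relation names below are illustrative assumptions and not a standard schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class SceneGraph:
    nodes: Dict[str, dict] = field(default_factory=dict)             # id -> attributes
    edges: List[Tuple[str, str, str]] = field(default_factory=list)  # (subject, relation, object)

    def add_entity(self, entity_id: str, **attrs) -> None:
        self.nodes[entity_id] = attrs

    def relate(self, subject: str, relation: str, obj: str) -> None:
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError("both entities must be added before relating them")
        self.edges.append((subject, relation, obj))

    def relations_of(self, entity_id: str) -> List[Tuple[str, str]]:
        """All (relation, object) pairs where entity_id is the subject."""
        return [(r, o) for s, r, o in self.edges if s == entity_id]

scene = SceneGraph()
scene.add_entity("ego", kind="vehicle")
scene.add_entity("lane_1", kind="lane")
scene.add_entity("ped_7", kind="pedestrian")
scene.relate("ego", "drives_in", "lane_1")       # functional relation
scene.relate("ped_7", "approaching", "lane_1")   # spatial/behavioral relation
print(scene.relations_of("ego"))
```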
Scene understanding must maintain temporal stability across frames. Flickering detections or inconsistent semantic labels can lead to unstable planning. Techniques include temporal smoothing, cross-frame data association to maintain consistent object identities, or memory networks that preserve contextual information across time.
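One of the simplest forms of temporal smoothing is a sliding-window majority vote over the class label of a tracked object. The sketch below assumes the object has already been associated across frames. The class name and window length are illustrative choices.

```python
from collections import Counter, deque

class LabelSmoother:
    """Report the most frequent label among the last `window` observations."""

    def __init__(self, window: int = 5) -> None:
        self.history = deque(maxlen=window)

    def update(self, label: str) -> str:
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]

smoother = LabelSmoother(window=3)
raw = ["car", "car", "truck", "car", "car"]   # one flickering frame
print([smoother.update(label) for label in raw])
```

The single-frame flicker to "truck" never reaches the planner, at the cost of a delay of a few frames before a genuine change of label is accepted.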
The temporal part of scene understanding is tightly coupled with motion prediction, that is, forecasting the future trajectories of all dynamic agents. There are two primary approaches: physics-based models (e.g., constant-velocity and bicycle models), which are simple and interpretable but limited in complex interactions, and learning-based models, where data-driven networks capture contextual dependencies and multiple possible futures (e.g., MultiPath, Trajectron++, VectorNet).
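The two physics-based models named above are short enough to write out. The sketch below uses a constant-velocity predictor and a kinematic bicycle model integrated with a simple forward-Euler step. The wheelbase, speed, and steering values are assumed example numbers.

```python
from math import cos, sin, tan
from typing import Tuple

def constant_velocity(x: float, y: float, vx: float, vy: float,
                      dt: float) -> Tuple[float, float]:
    """Predict position after dt seconds assuming unchanged velocity."""
    return x + vx * dt, y + vy * dt

def bicycle_step(x: float, y: float, heading: float, speed: float,
                 steer: float, wheelbase: float, dt: float) -> Tuple[float, float, float]:
    """One forward-Euler step of the kinematic bicycle model (angles in radians)."""
    x_next = x + speed * cos(heading) * dt
    y_next = y + speed * sin(heading) * dt
    heading_next = heading + speed / wheelbase * tan(steer) * dt
    return x_next, y_next, heading_next

# Roll a turning vehicle forward for one second in 0.1 s steps (assumed values).
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = bicycle_step(*state, speed=10.0, steer=0.05, wheelbase=2.7, dt=0.1)
print(tuple(round(s, 3) for s in state))
```

Both models ignore the intentions of other agents, which is why they degrade quickly in interactive scenes and are usually restricted to short prediction horizons.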
Designing autonomous systems that perform reliably poses many challenges. For the front end of the AV pipeline discussed in this chapter, the challenges center on working gracefully across the range of operating conditions (ODD), the performance characteristics of the sensors, and supply chain concerns.
Weather is a fundamental source of uncertainty for autonomous systems because it directly degrades sensor performance, but its impact varies significantly across ground, airborne, marine, and space domains. On the ground, rain, fog, snow, and dust can severely impair optical sensors (cameras, lidar) through scattering, attenuation, and occlusion, while also affecting radar through multipath and clutter—making perception and object classification the primary bottlenecks for autonomous vehicles. In airborne systems, weather effects such as icing, turbulence, and convective storms influence both sensing and vehicle dynamics; however, aviation benefits from structured sensing (e.g., radar, inertial systems, GPS) and well-developed weather-avoidance procedures, allowing autopilot systems to remain robust as long as hazardous regions are avoided. Marine systems face persistent challenges from sea spray, wave motion, and low-contrast environments, which degrade vision systems and introduce instability in sensor measurements, though radar and sonar provide complementary resilience. In space, traditional “weather” is absent, but analogous environmental effects—such as solar radiation, cosmic rays, and thermal extremes—impact sensor reliability and electronics, requiring radiation-hardened designs and redundancy. Across all domains, the key distinction is that weather (or its equivalent) not only reduces sensor fidelity but also increases uncertainty in state estimation and decision-making, making sensor fusion, redundancy, and probabilistic reasoning essential for maintaining safe autonomous operation.
Further, the use of electromagnetic (EM) energy in modern transportation corridors is increasing rapidly, driven by three major factors. First, the expansion of cellular networks to support continuous telecommunications for travelers has intensified ambient EM activity. Second, the widespread integration of active sensors, such as radar and LiDAR, within vehicles has introduced additional high-frequency sources. Third, infrastructure operators are deploying active sensing technologies in Roadside Units (RSUs) to enable vehicle-to-infrastructure (V2I) communication and monitoring. The effect of concentrated EM sources is relatively well understood in the visual band, where care is taken in the design of highly reflective civil infrastructure and in methods for handling night-time interference. However, the same care has not been applied to all sensor modalities. Especially for ground and airborne systems (e.g., air taxi corridors), active sensors create dense EM energy corridors that raise new and still uncharacterized challenges related to interference, coexistence, and safety.
Beyond weather and EMI, sensor modalities must be complete enough to provide coverage under the constraints of the civil engineering infrastructure. Important aspects include the handling of curves, on/off ramps, bridges, tunnels, and more. For a designer there is a complex tradeoff between sensor type, number of sensors, and cost of sensors. For airborne, marine, and space systems, power and weight are also primary concerns.
Finally, because of the semiconductor business structure, cost and supply chain are intimately connected. The relationship between cost and volume in semiconductors is fundamentally shaped by high fixed costs and low marginal costs, creating powerful economies of scale. Semiconductor manufacturing requires enormous upfront investment in fabrication facilities (fabs), process development, and mask sets—often totaling billions of dollars—while the incremental cost of producing each additional chip (once the fab is running) is relatively low. As production volume increases, these fixed costs are amortized over a larger number of units, driving down the cost per chip. This dynamic is reinforced by learning curve effects (often described by Wright’s Law), where yield improvements, process optimizations, and defect reduction further reduce per-unit costs with cumulative volume. However, this relationship is not linear: advanced nodes (e.g., sub-5nm) introduce escalating mask and tooling costs that require extremely high volumes to be economically viable, while lower-volume or specialized chips (e.g., automotive, aerospace) often rely on mature nodes where costs are more stable but less aggressively optimized. As a result, the semiconductor industry exhibits a strong coupling between scale, technology node, and market demand, with leading-edge innovation economically justified primarily in high-volume applications such as consumer electronics and data center computing.
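The cost-versus-volume relationship described above can be made concrete with two small formulas: amortized unit cost (fixed cost spread over volume plus marginal cost) and Wright's Law (cost falls by a fixed fraction with every doubling of cumulative output). All dollar figures, volumes, and the 20% learning rate below are invented for illustration and do not describe any real product.

```python
from math import log2

def unit_cost(fixed_cost: float, marginal_cost: float, volume: float) -> float:
    """Per-chip cost when fixed costs are amortized over `volume` units."""
    return fixed_cost / volume + marginal_cost

def wright_cost(first_unit_cost: float, cumulative_units: float,
                learning_rate: float) -> float:
    """Wright's Law: cost drops by `learning_rate` per doubling of cumulative units."""
    exponent = -log2(1.0 - learning_rate)
    return first_unit_cost * cumulative_units ** (-exponent)

fixed = 1e9      # assumed design, mask, and tooling cost in dollars
marginal = 20.0  # assumed per-chip manufacturing cost in dollars
for volume in (1e5, 1e6, 1e8):   # low-volume, mid-volume, consumer-scale
    print(f"{volume:>12,.0f} units -> ${unit_cost(fixed, marginal, volume):,.2f} per chip")

print(round(wright_cost(100.0, 4, learning_rate=0.2), 2))
```

With these assumed numbers, the same design costs roughly a thousand times more per chip at the lowest volume than at consumer scale, which is the economic pressure behind reusing consumer-market silicon in low-volume domains.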
Advanced semiconductors can offer significant performance improvements in function, power, and cost. However, the economics of volume often determine whether the chip will be built. Today, the semiconductor cycle is dominated by consumer products. Automotive markets offer mid-tier volumes, and the other modalities (airborne, space, marine) are very low volume markets. The resulting design challenge is a choice between two options: use advanced semiconductor chips from the consumer market and accept their limitations on safety, or use lower-tier semiconductor chips and live with performance, power, cost, and weight challenges.
Having designed a sensor, object recognition, and location services section, how does one test these components? The fundamentals are consistent with the discussions in chapter 2. One defines an ODD, builds tests underneath this ODD, applies these tests, and determines correctness. The application of the tests can be virtual (simulation), physical (test track), or a mix based on components (Hardware in Loop or Software in Loop). The population of tests needs to be complete enough to show sufficient coverage. The introduction of sensors and AI adds significant complexity to this process.
Testing sensors in safety-critical systems is particularly challenging when viewed through the lens of verification, validation (V&V), and certification, because sensors are both hardware devices and context-dependent measurement systems. Verification, which ensures that the sensor meets its design specifications, can be addressed through laboratory calibration, environmental stress testing, and compliance with standards such as ISO 16750 (environmental conditions), DO-160 (avionics), and MIL-STD-810 (defense systems). However, validation, which ensures that the sensor performs adequately in real operational contexts, is far more complex. Sensor performance depends heavily on the operational design domain (ODD), including weather, lighting, clutter, and interference conditions, which are difficult to fully replicate or bound. This gap between controlled verification and real-world validation is especially acute for perception sensors (e.g., cameras, radar, lidar), where performance is probabilistic rather than deterministic and strongly influenced by environmental variability. Today, there is a great deal of innovation in mechanical test apparatus that mimics physical movement inside anechoic chambers to recreate difficult test scenarios. In the outdoor environment, swarms of drones acting as EM sensors and noise sources provide a similar function on test tracks.
| Conventional Algorithm | ML Algorithm | Comment |
|---|---|---|
| Logical Theory | No Theory | In conventional algorithms, one needs a theory of operation to implement the solution. ML algorithms can often work without a clear understanding of exactly why they work. |
| Analytical | Not Analytical | Conventional algorithms are accurate in a way we can understand; however, ML algorithms are not easily understood and often behave like a “black box.” |
| Causal | Correlation | Conventional algorithms focus on causality, while ML algorithms discover correlations. The difference is important if one wants to reason at higher levels. |
| Deterministic | Non-Deterministic | Conventional algorithms are deterministic in nature, and ML algorithms are fundamentally probabilistic in nature. |
| Known Computational Complexity | Unknown Computational Complexity | Given the analyzable nature of conventional algorithms, one can build a model for computational complexity. This is not always possible for ML techniques, which may require testing to evaluate computational complexity. |
Table 1: Contrast of Conventional and Machine Learning Algorithms
The introduction of AI as a replacement for traditional software introduces significant validation issues (Table 1). Significantly, many of the techniques developed for testing conventional software, such as code reviews, code coverage, and static analysis tools, do not transfer directly to AI components. Further, to test an AI component, it appears likely that one must test the method by which it was trained and have access to the training data.
Safety standards across automotive, marine, airborne, and space domains are now evolving to address the introduction of AI/ML-driven functionality, shifting from purely deterministic assurance models toward data-driven and probabilistic validation frameworks. In automotive, traditional functional safety under ISO 26262 has been extended by ISO/PAS 8800 and ISO 21448 to explicitly address perception uncertainty, training data coverage, and performance limitations of AI-based systems. In aviation, guidance such as DO-178C is being supplemented by emerging frameworks like DO-387 (in development) to tackle non-deterministic behavior, explainability, and learning assurance. Similarly, space systems governed by ECSS standards and marine systems guided by International Maritime Organization frameworks are beginning to incorporate autonomy and AI considerations, particularly for unmanned and remotely operated platforms. Across all domains, a common trend is emerging: safety assurance is moving from static compliance toward lifecycle-based assurance, including dataset governance, simulation-based validation, runtime monitoring, and continuous certification concepts. This reflects a fundamental shift in safety engineering, from proving correctness of fixed logic to bounding the behavior of adaptive, data-driven systems operating under uncertainty.
The remainder of this section presents a practical, simulation-driven approach to validating the perception, mapping (HD maps/digital twins), and localization layers of an autonomous driving stack. The core idea is to anchor tests in the operational design domain (ODD), express them as reproducible scenarios, and report metrics that connect module-level behavior to system-level safety.
We decompose the stack into Perception (object detection/tracking), Mapping (HD map/digital twin creation and consistency), and Localization (GNSS/IMU and vision/LiDAR aiding) and validate each with targeted KPIs and fault injections. The evidence is organized into a safety case that explains how module results compose at system level. Tests are derived from the ODD and instantiated as logical/concrete scenarios (e.g., with a scenario language like Scenic) over the target environment. This gives you systematic coverage and reproducible edge-case generation while keeping hooks for standards-aligned arguments (e.g., ISO 26262/SOTIF) and formal analyses where appropriate.
The objective is to quantify detection performance—and its safety impact—across the ODD. In end-to-end, high-fidelity (HF) simulation, we log both simulator ground truth and the stack’s detections, then compute per-class statistics as a function of distance and occlusion. Near-field errors are emphasized because they dominate braking and collision risk. Scenario sets should include partial occlusions, sudden obstacle appearances, vulnerable road users, and adverse weather/illumination, all realized over the site map so that failures can be replayed and compared.
Figure 1 illustrates the object comparison. Green boxes mark objects present in the simulator ground truth, while red boxes mark objects detected by the AV stack. Threshold-based rules are used to compare the two sets of objects. The comparison is expected to provide specific indicators of which vehicles are detectable at different ranges, separating safe areas from danger areas.
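A threshold-based comparison of ground-truth and detected objects can be sketched as follows. The example assumes axis-aligned bird's-eye-view boxes in the ego frame, an intersection-over-union (IoU) threshold of 0.5, and range bins of 0-20 m, 20-50 m, and 50-100 m; all of these, along with the class and function names, are illustrative assumptions rather than the rules used in the figure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in the ego frame (metres)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self):
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    def center_range(self):
        """Distance from the ego origin to the box centre."""
        return math.hypot((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def iou(a, b):
    """Intersection over union of two boxes."""
    ix = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    iy = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = ix * iy
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def recall_by_range(ground_truth, detections, iou_threshold=0.5,
                    bin_edges=(0.0, 20.0, 50.0, 100.0)):
    """Greedy one-to-one matching; returns {(lo, hi): (matched, total)} per range bin."""
    unmatched = list(detections)
    stats = {(lo, hi): [0, 0] for lo, hi in zip(bin_edges, bin_edges[1:])}
    for gt in ground_truth:
        r = gt.center_range()
        key = next((k for k in stats if k[0] <= r < k[1]), None)
        if key is None:
            continue  # outside the evaluated range
        stats[key][1] += 1
        best = max(unmatched, key=lambda d: iou(gt, d), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            stats[key][0] += 1
            unmatched.remove(best)  # each detection may match only once
    return {k: tuple(v) for k, v in stats.items()}


truth = [Box(9.0, -1.0, 13.0, 1.0), Box(59.0, -1.0, 63.0, 1.0)]
detected = [Box(9.2, -1.0, 13.2, 1.0)]
print(recall_by_range(truth, detected))
```

Reporting matched and total counts per bin, rather than a single aggregate, keeps near-field misses visible, which matters because those dominate braking and collision risk.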
Validation begins with how the map and digital twin are produced. Aerial imagery or LiDAR is collected with RTK geo-tagging and surveyed control points, then processed into dense point clouds and classified to separate roads, buildings, and vegetation. From there, you export OpenDRIVE (for lanes, traffic rules, and topology) and a 3D environment for HF simulation. The twin should be accurate enough that perception models do not overfit artifacts and localization algorithms can achieve lane-level continuity.
Key checks include lane topology fidelity versus survey, geo-consistency in centimeters, and semantic consistency (e.g., correct placement of occluders, signs, crosswalks). The scenarios used for perception and localization are bound to this twin so that results can be reproduced and shared across teams or vehicles. Over time, you add change-management: detect and quantify drifts when the real world changes (construction, foliage, signage) and re-validate affected scenarios.
Here, the focus is on the robustness of ego-pose to sensor noise, outages, and map inconsistencies. In simulation, you inject GNSS multipath, IMU bias, packet dropouts, or short GNSS blackouts and watch how quickly the estimator diverges and re-converges. Similar tests perturb the map (e.g., small lane-mark misalignments) to examine estimator sensitivity to mapping error.
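The blackout experiment described above can be illustrated with a deliberately simple one-dimensional estimator. This sketch assumes constant true speed, a biased odometry speed, a noise-free GNSS position fix, and a fixed correction gain; none of these reflect a particular localization stack, and all names and values are hypothetical.

```python
def simulate_blackout(duration_s=30.0, dt=0.1, speed=10.0, speed_bias=0.5,
                      gain=0.2, blackout=(10.0, 20.0), tolerance_m=0.5):
    """Dead reckoning with GNSS correction; GNSS is withheld during `blackout`."""
    x_true = 0.0
    x_est = 0.0
    peak_error = 0.0
    reconverged_after = None
    for k in range(int(round(duration_s / dt))):
        t = (k + 1) * dt
        x_true += speed * dt
        x_est += (speed + speed_bias) * dt          # biased odometry drifts
        gnss_available = not (blackout[0] <= t < blackout[1])
        if gnss_available:
            x_est += gain * (x_true - x_est)        # assumed noise-free GNSS fix
        error = abs(x_est - x_true)
        peak_error = max(peak_error, error)
        if t >= blackout[1] and reconverged_after is None and error <= tolerance_m:
            reconverged_after = t - blackout[1]     # seconds after GNSS returns
    return {"peak_error_m": peak_error, "reconvergence_s": reconverged_after}


print(simulate_blackout())
```

The two reported quantities, peak error during the outage and time to return within tolerance, correspond to how quickly the estimator diverges and re-converges. A real test would add measurement noise and repeat the run over many seeds.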
The following is a short KPI list:
The current validation methods perform a one-to-one mapping between the expected and actual locations. As shown in Fig. 2, the vehicle position deviation is computed for each frame and recorded in the validation report. Summary parameters, such as the minimum, maximum, and mean deviation, are then calculated from the same report. In the validation procedure, it is also possible to modify the simulator so that it adds noise to the localization process, which allows the robustness of localization to be checked and its performance validated.
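The per-frame comparison and its summary statistics can be sketched as below. The sketch assumes 2D positions given as `(x, y)` tuples in metres and uses zero-mean Gaussian noise as the injected disturbance; the function names and the noise model are assumptions for illustration.

```python
import math
import random


def deviation_report(expected, actual):
    """Per-frame Euclidean deviation plus min/max/mean summary."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have one pose per frame")
    per_frame = [math.dist(e, a) for e, a in zip(expected, actual)]
    return {
        "per_frame": per_frame,
        "min": min(per_frame),
        "max": max(per_frame),
        "mean": sum(per_frame) / len(per_frame),
    }


def add_position_noise(poses, sigma, seed=0):
    """Return a copy of `poses` with Gaussian noise added (seeded for repeatability)."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in poses]


expected = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actual = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
print(deviation_report(expected, actual))
print(deviation_report(expected, add_position_noise(expected, sigma=0.2))["mean"])
```

Seeding the noise generator keeps the injected disturbance reproducible, so a failing run can be replayed exactly.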
A two-stage workflow balances coverage and realism. First, use low-fidelity (LF) tools (e.g., planner-in-the-loop with simplified sensors and traffic) to sweep large grids of logical scenarios and identify risky regions in parameter space (relative speed, initial gap, occlusion level). Then, promote the most informative concrete scenarios to HF simulation with photorealistic sensors for end-to-end validation of perception and localization interactions. Where appropriate, a small, curated set of scenarios is carried to closed-track trials. Success criteria are consistent across all stages, and post-run analyses attribute failures to perception, localization, prediction, or planning so fixes are targeted rather than generic.
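The first stage of this workflow can be sketched as a grid sweep followed by a promotion step. The stopping-distance model below stands in for an LF tool and is an assumption made for illustration, as are the parameter values and the choice to promote the scenarios closest to the pass/fail boundary.

```python
from itertools import product


def stopping_margin(closing_speed, initial_gap, occlusion,
                    decel=6.0, base_latency=0.5, occlusion_latency=1.0):
    """Remaining gap (m) after reacting and braking; negative means a collision.

    `occlusion` in [0, 1] is assumed to delay detection proportionally.
    """
    latency = base_latency + occlusion * occlusion_latency
    stopping_distance = closing_speed * latency + closing_speed ** 2 / (2.0 * decel)
    return initial_gap - stopping_distance


def sweep(speeds, gaps, occlusions):
    """Evaluate every combination of the logical scenario parameters."""
    return [
        {"closing_speed": v, "initial_gap": g, "occlusion": o,
         "margin": stopping_margin(v, g, o)}
        for v, g, o in product(speeds, gaps, occlusions)
    ]


def promote(results, budget):
    """Select the `budget` scenarios nearest the pass/fail boundary for HF simulation."""
    return sorted(results, key=lambda r: abs(r["margin"]))[:budget]


results = sweep(speeds=(5.0, 10.0, 15.0, 20.0),
                gaps=(10.0, 20.0, 40.0, 80.0),
                occlusions=(0.0, 0.5, 1.0))
print(sum(r["margin"] < 0.0 for r in results), "of", len(results), "scenarios fail in LF")
for scenario in promote(results, budget=5):
    print(scenario)
```

Promoting boundary cases is one possible selection rule. Other rules, such as promoting the worst failures or sampling for diversity, fit the same structure by changing the sort key.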
The chapter develops a comprehensive view of perception, mapping, and localization as the foundation of autonomous systems, emphasizing how modern autonomy builds on both historical automation (e.g., autopilots across domains) and recent advances in AI. It explains how perception converts raw sensor data—across cameras, LiDAR, radar, and acoustic systems—into structured understanding through object detection, sensor fusion, and scene interpretation. A key theme is that no single sensor is sufficient; instead, robust autonomy depends on multi-modal sensor fusion, probabilistic estimation, and careful calibration to manage uncertainty. The chapter also highlights the transformative role of AI, particularly deep learning, in enabling scalable perception and scene understanding, while noting that these methods introduce new challenges related to data dependence, generalization, and interpretability.
A second major focus is on sources of instability and validation, where the chapter connects environmental effects (weather, electromagnetic interference), infrastructure constraints, and semiconductor economics to system-level performance. It underscores that validation must be grounded in the operational design domain (ODD) and cannot rely solely on physical testing, requiring a combination of simulation, hardware-in-the-loop, and scenario-based methods. The introduction of AI further complicates verification and validation because of its probabilistic, non-deterministic nature, challenging traditional assurance techniques. As a result, safety approaches across domains are evolving toward lifecycle-based assurance, incorporating data governance, simulation-driven testing, and continuous monitoring. The chapter concludes with a structured validation framework that links perception, mapping, and localization performance to system-level safety metrics, emphasizing reproducibility, coverage, and traceability in building a credible safety case.
Autonomous vehicles do not become safe only by perceiving the world correctly. They must also decide what to do, plan a feasible motion, and execute that motion through the vehicle actuators. This chapter focuses on the part of the autonomy stack that transforms environmental understanding into safe action: decision-making, motion planning, and control. Perception and localisation estimate the state of the vehicle and its surroundings; prediction estimates how other actors may move; decision-making selects the intended maneuver; motion planning generates a feasible trajectory; control tracks that trajectory through steering, braking, throttle, thrust, or other actuators; and monitoring and fallback functions supervise the result and trigger replanning or minimal-risk behavior when needed.
These functions are strongly interdependent. A motion planner may generate a valid trajectory in isolation, but still fail at system level if perception is delayed, localisation drifts, prediction is uncertain, or the controller cannot physically track the planned path. Likewise, a controller may perform well against a clean reference trajectory, but still produce unsafe or uncomfortable behavior if the planner generates abrupt, infeasible, or poorly timed commands. For this reason, planning and control validation cannot be limited to unit testing of individual algorithms. Unit testing is necessary, but it must be connected to integration testing, system-level scenario testing, and operational validation inside the intended Operational Design Domain (ODD).
The goal of this chapter is therefore not only to introduce common control and planning methods, but to show how they are validated as part of a complete autonomous vehicle system. The chapter positions decision-making, motion planning, and control within the autonomy stack, then discusses the main methods and architectures used in these layers, including classical control, AI-based control, behavioural decision-making, and trajectory planning. The emphasis is placed on validation implications: what must be checked, what evidence is needed, and how failures may appear when the component is integrated into the full system.
A central idea in this chapter is that planning and control validation should be scenario-based. Autonomous vehicles operate in environments where safety depends on interactions between the ego vehicle, other road users, infrastructure, road geometry, environmental conditions, and vehicle dynamics. Therefore, it is not enough to ask whether a planner works for one example case. Instead, engineers must define scenario families, vary their parameters, measure system behavior, and evaluate whether the vehicle remains safe, legal, comfortable, and robust under the expected range of conditions.
The chapter also connects planning and control validation to the broader systems-engineering process. In the V-model perspective introduced earlier in the handbook, planning and control appear at several validation levels: component verification, integration testing, system validation, and operational validation. By the end of this chapter, the reader should understand how to move from algorithm descriptions to a validation workflow: define the function under test, identify its ODD and assumptions, design scenarios, select measurable performance and safety criteria, choose suitable test methods, and package the resulting evidence for safety assurance.
Planning and control are the parts of an autonomous system where perception turns into action. A vehicle does not become autonomous simply by detecting objects or building a map of the world. It must also decide what to do next, determine how that decision can be executed safely, and then convert the planned motion into actuator commands. In other words, this chapter sits in the middle of the autonomy loop: it connects the understanding of the environment to the physical behavior of the vehicle.
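The closed-loop chain can be viewed as follows: perception and localisation → prediction → decision-making → motion planning → control → actuation, with monitoring and fallback functions supervising the result and triggering replanning or minimal-risk behavior when needed.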
This chain is important because no single layer is sufficient on its own. A motion planner can produce a good trajectory in isolation, but the same plan may become unsafe if perception is delayed, prediction is wrong, localisation drifts, or the controller cannot physically track the path. A controller may behave correctly on a clean reference trajectory, but still create unsafe behavior if the planner issues abrupt commands or if the vehicle state changes faster than the controller can respond. For that reason, planning and control must be treated as a system-level function, not only as a set of individual algorithms.
| Layer | Main role | Typical input | Typical output | Main validation question |
|---|---|---|---|---|
| Decision-making | Chooses the maneuver or behavior | Goals, scene context, traffic rules, mission state | Stop, yield, follow, overtake, lane change, reroute | Is the chosen behavior correct and rule-compliant? |
| Motion planning | Converts intent into a path or trajectory | Behavioral decision, map, obstacles, vehicle constraints | Safe and feasible trajectory | Can the vehicle execute this path safely and legally? |
| Control | Tracks the planned trajectory | Trajectory, vehicle state, actuator feedback | Steering, braking, throttle, thrust commands | Can the vehicle follow the plan within dynamics and timing limits? |
| Monitoring / fallback | Detects unsafe or degraded execution | Residuals, health signals, timing, confidence metrics | Replanning, slowdown, minimal-risk maneuver, safe stop | Does the system recover safely when things go wrong? |
This structure is useful because it keeps the chapter focused on the system function rather than on one algorithm family only. In an autonomous vehicle, the interesting question is not merely whether a controller works, but whether the complete decision–planning–control chain behaves safely and predictably in the intended Operational Design Domain (ODD).
Compared with perception and localisation, this chapter deals more directly with action selection and vehicle motion. That makes the safety implications more immediate. A perception error may be serious, but a planning or control error can immediately turn into unsafe motion. This is why planning and control usually require tight timing, careful supervision, and explicit fallback behavior.
At the same time, this layer is also more tightly coupled to vehicle dynamics than higher-level software. The planner cannot ignore turning radius, braking distance, acceleration limits, road friction, actuator delay, or comfort constraints. A behavior module cannot ignore traffic rules, interaction with other agents, or the fact that some maneuvers are safe only in certain conditions. Planning and control therefore sit at the intersection of:
The same functional chain exists in all autonomy domains, but the emphasis changes with the physical environment.
| Domain | Main emphasis | Typical planning and control style |
|---|---|---|
| Ground systems | Human interaction, road rules, friction-limited dynamics, low-latency reaction | reactive planning, trajectory tracking, stop/yield behavior, comfort-aware control |
| Airborne systems | Stability, altitude, airspace rules, weather, safety margins | layered control, outer-loop guidance, strict envelope protection |
| Marine systems | Disturbances from waves, currents, wind, sparse infrastructure, long duration | robust control, waypoint navigation, energy-aware mission planning |
| Space systems | Communication delays, orbital mechanics, limited actuation, no real-time human intervention | model-based control, mission planning, trajectory optimization under physical constraints |
Ground vehicles must handle dense interaction with pedestrians, cyclists, lane markings, signals, and other vehicles. Airborne systems face a three-dimensional environment where stability and safety margins dominate. Marine systems operate more slowly, but disturbances and sparse sensing make robustness essential. Space systems are the most constrained of all: decisions are often conservative, highly validated, and based on first-principles dynamics because real-time intervention is limited or impossible.
This is why the chapter cannot treat control and planning as a single universal recipe. The underlying logic is shared, but the validation target changes depending on the domain, the vehicle dynamics, the available sensors, and the consequences of failure.
In this chapter, we are able to follow the chain from a high-level behavior down to the executed motion:
This chapter, therefore, serves as a link between understanding the environment and demonstrating that the vehicle can operate safely within it. It prepares the reader for the next sections, where the main control strategies, planning architectures, and validation methods are discussed in more detail.
Having located planning and control within the autonomy stack, the next step is to examine the principal methods used to implement this layer. These methods fall broadly into classical control, AI-based control, behavioral decision-making, and motion-planning architectures, each with different implications for validation, traceability, and safety assurance.
The chapter therefore distinguishes between two linked topics:
A useful way to view the full chain is:
The distinction is important. A behavior module may be correct in isolation, but if it produces an aggressive maneuver, the motion planner may fail to find a safe trajectory. Likewise, a good planner may still produce unsafe results if the controller cannot track the planned motion or if the system state estimate is inaccurate. This is why planning and control must be presented together, even if they are implemented by different software modules.
Classical control strategies form the foundation of most vehicle control systems. They are based on mathematical models of the vehicle and on feedback control principles that relate the current vehicle state to a desired reference. Their main advantage is that they are well understood, mathematically structured, and comparatively easy to analyze.
| Aspect | Description |
|---|---|
| Core idea | Measure the current state, compare it to the desired state, compute the error, and generate a control action that reduces the error. |
| Typical model | Vehicle dynamics, often simplified or linearized around a nominal operating point. |
| Main strength | Predictability, stability analysis, transparency, and mature engineering practice. |
| Main limitation | Performance depends strongly on model accuracy and operating conditions. |
1. PID control. The proportional-integral-derivative controller is the most widely used classical method. It computes the control action from three terms: a proportional term that reacts to the present error, an integral term that accumulates past error, and a derivative term that responds to the rate of change of the error.
PID control is widely used for speed regulation, yaw-rate control, heading correction, and other low-level vehicle functions where the controlled variable must stay near a reference value.
2. LQR control. The linear quadratic regulator is an optimal control method that minimizes a cost function balancing tracking error and control effort. It requires a linear or linearized model and is often used for trajectory tracking, stabilization, and path-following tasks.
3. State estimation. Classical control is rarely used alone. Controllers usually depend on estimated states obtained through Kalman filters or related observers. These estimators combine measurements from multiple sensors such as IMU, GNSS, wheel-speed sensors, and steering sensors to produce a more reliable estimate of position, velocity, orientation, and yaw rate.
4. Sliding mode control. Sliding mode control is a robust control method designed to keep the system on a predefined surface despite disturbances and model uncertainty. It is attractive in cases where the system must tolerate external perturbations, though its practical tuning can be more demanding.
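As an illustration of the first method in the list above, the following is a minimal PID sketch applied to speed regulation. The first-order vehicle model, the gains, the acceleration limits, and the simple anti-windup rule (freezing the integral while the output is saturated) are all assumptions chosen for the example, not tuned or recommended values.

```python
class PID:
    """Textbook PID controller with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, output_limits=(float("-inf"), float("inf"))):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.output_limits = output_limits
        self.integral = 0.0
        self.prev_error = None

    def update(self, reference, measurement, dt):
        error = reference - measurement
        # No derivative action on the first sample, when no previous error exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        candidate_integral = self.integral + error * dt
        command = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        low, high = self.output_limits
        if low <= command <= high:
            self.integral = candidate_integral  # only integrate when not saturated
        self.prev_error = error
        return min(max(command, low), high)


def simulate_speed_hold(target_mps=20.0, duration_s=60.0, dt=0.1, drag=0.1):
    """Assumed plant: dv/dt = accel_cmd - drag * v. Returns the final speed."""
    controller = PID(kp=1.0, ki=0.5, kd=0.0, output_limits=(-3.0, 3.0))
    speed = 0.0
    for _ in range(int(round(duration_s / dt))):
        accel_cmd = controller.update(target_mps, speed, dt)
        speed += (accel_cmd - drag * speed) * dt
    return speed


print(f"final speed: {simulate_speed_hold():.2f} m/s")
```

The example uses only proportional and integral gains; the integral term is what removes the steady-state error caused by the drag term in the assumed plant.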
Classical control methods are attractive in safety-critical systems because they offer a number of practical advantages:
These strengths do not remove the need for validation, but they make the validation argument more structured because the controller behavior can often be bounded more directly.
The limitations of classical control appear when the operating conditions move away from the assumptions used in design.
For this reason, classical control remains highly valuable, but it is often combined with higher-level decision logic or learned components.
AI-based control strategies use data-driven methods to learn control behavior from examples, simulations, or interaction with the environment. Instead of specifying the full model and control law manually, the system learns a mapping from state or observation to action.
| Aspect | Description |
|---|---|
| Core idea | Learn a control policy from data, simulation, or interaction. |
| Typical tools | Reinforcement learning, supervised learning, neural network controllers, learned models for predictive control. |
| Main strength | Ability to represent complex nonlinear behavior and adapt to rich environments. |
| Main limitation | Harder to interpret, certify, and guarantee under all operating conditions. |
1. Reinforcement learning. Reinforcement learning learns a policy through trial and error. The controller receives observations, selects actions, and receives rewards or penalties. Over time, it tries to maximize cumulative reward. In autonomous driving, this can be used for behavior selection, trajectory control, or adaptive decision-making in simulation-based training environments.
2. Supervised learning for control. In supervised approaches, the model is trained on expert demonstrations or labeled data. It learns to imitate desired control outputs. This can be useful when expert behavior is available and when the goal is to reproduce a known control style or policy.
3. Neural network controllers. A neural network may directly map current state estimates to steering, braking, throttle, or other actuator commands. This can be effective in highly nonlinear settings, but it creates validation challenges because the internal decision process is not as transparent as in classical control.
4. Learned models inside MPC. Model predictive control can remain the outer optimization structure while a learned model replaces or augments part of the vehicle dynamics model. This can improve accuracy in difficult-to-model situations while preserving the optimization structure of MPC.
AI-based control can be powerful, but its validation burden is significantly higher.
These characteristics do not make AI-based control unsuitable, but they mean that the validation strategy must be broader, more scenario-driven, and more conservative in how safety is claimed.
The main concerns for AI-based control are:
For these reasons, AI-based control is often paired with explicit runtime monitoring, fallback logic, and classical low-level control.
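The pairing of a learned policy with runtime monitoring and a classical fallback can be sketched as a thin supervisory wrapper. In this sketch the learned policy is a stand-in function, and the acceleration envelope, the minimum time gap, and the state keys `gap_m` and `speed_mps` are hypothetical choices made for illustration.

```python
def supervised_command(state, learned_policy, fallback_controller,
                       accel_limits=(-6.0, 2.5), min_time_gap_s=1.5):
    """Return (acceleration command, source), rejecting unsafe learned proposals."""
    proposed = learned_policy(state)
    # A NaN proposal fails this comparison and is therefore rejected as well.
    in_envelope = accel_limits[0] <= proposed <= accel_limits[1]
    time_gap = state["gap_m"] / max(state["speed_mps"], 0.1)
    accelerating_into_short_gap = time_gap < min_time_gap_s and proposed > 0.0
    if in_envelope and not accelerating_into_short_gap:
        return proposed, "learned"
    return fallback_controller(state), "fallback"


def conservative_brake(state):
    """Classical fallback: fixed, moderate deceleration (assumed value)."""
    return -2.0


def eager_policy(state):
    """Stand-in for a learned policy that always asks for acceleration."""
    return 1.0


print(supervised_command({"gap_m": 50.0, "speed_mps": 10.0}, eager_policy, conservative_brake))
print(supervised_command({"gap_m": 10.0, "speed_mps": 10.0}, eager_policy, conservative_brake))
```

Because the monitor's checks are explicit and simple, they can be verified exhaustively even when the learned policy itself cannot, and logging the returned source gives a direct record of how often the fallback was needed.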
Behavioral algorithms decide the next maneuver or driving intention. They answer questions such as:
Behavioral algorithms define the transition from perception and prediction to motion planning. They are the bridge between what the vehicle understands and what it intends to do.
| Method | Main idea | Typical use |
|---|---|---|
| Finite State Machines | Use discrete states and transitions to represent maneuvers | lane following, stop-and-go logic, lane change preparation |
| Hierarchical State Machines | Organize behaviors in layers with mission-level and maneuver-level states | more complex driving logic with nested decisions |
| Behavior Trees | Use tree structures with selectors, sequences, and conditions | modular, readable, reusable maneuver logic |
| Rule-Based Systems | Apply explicit if-condition-then-action rules | traffic-law compliance, right-of-way handling |
| Utility-Based Methods | Score candidate behaviors and choose the best option | trade-offs between safety, efficiency, and comfort |
| Learning-Based Behavior | Learn behavior selection from data or interaction | adaptive maneuver selection, complex interaction handling |
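The finite-state-machine row of the table can be made concrete with a small transition table. The states, event names, and the rule that an obstacle event forces a stop from any state are assumptions chosen for illustration, not a complete maneuver set.

```python
from enum import Enum, auto


class Maneuver(Enum):
    LANE_FOLLOW = auto()
    PREPARE_LANE_CHANGE = auto()
    LANE_CHANGE = auto()
    STOP = auto()


# Explicit (state, event) -> next state table; unlisted pairs keep the current state.
TRANSITIONS = {
    (Maneuver.LANE_FOLLOW, "slow_lead_vehicle"): Maneuver.PREPARE_LANE_CHANGE,
    (Maneuver.PREPARE_LANE_CHANGE, "gap_available"): Maneuver.LANE_CHANGE,
    (Maneuver.PREPARE_LANE_CHANGE, "gap_closed"): Maneuver.LANE_FOLLOW,
    (Maneuver.LANE_CHANGE, "lane_change_complete"): Maneuver.LANE_FOLLOW,
    (Maneuver.STOP, "path_clear"): Maneuver.LANE_FOLLOW,
}


def step(state, event):
    """Advance the behavior state machine by one event."""
    if event == "obstacle_ahead":
        return Maneuver.STOP  # safety event overrides every other transition
    return TRANSITIONS.get((state, event), state)


def run(events, state=Maneuver.LANE_FOLLOW):
    """Return the full state trace, including the initial state."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace


for s in run(["slow_lead_vehicle", "gap_available", "obstacle_ahead", "path_clear"]):
    print(s.name)
```

Keeping the transitions in a table means every state and event pair can be enumerated and tested, which is one reason state machines are comparatively easy to validate.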
Behavioral logic must be validated for:
Behavioral algorithms are therefore not just a software convenience. They are part of the safety argument because they determine when the vehicle acts, when it waits, and when it escalates to a fallback behavior.
Motion planning turns a behavioral decision into a concrete path or trajectory. If the behavioral layer decides *what* the vehicle should do, motion planning determines *how* it can do it safely.
1. Grid-based planning. Grid planners such as A* or D* work on a discretized map. They are useful for route-level or map-based planning and can be efficient when the environment is represented in a structured way.
2. Sampling-based planning. Methods such as RRT, RRT*, and PRM explore the state space by sampling and connecting feasible states. They are useful in higher-dimensional planning problems and in situations where exact search is difficult.
3. Optimization-based planning. Trajectory optimization and MPC-style planning formulate motion planning as an optimization problem. They are especially suitable for generating smooth trajectories while respecting constraints on collision avoidance, curvature, acceleration, and comfort.
4. Potential-field methods. These methods use attractive and repulsive forces to guide the vehicle. They are conceptually simple, but they may suffer from local minima or unstable motion in complex scenes.
5. Lattice-based planning. Predefined motion primitives are searched to build feasible trajectories. This is efficient and often attractive for road vehicles because it can respect kinodynamic constraints more naturally.
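As an illustration of the grid-based family in the list above, the following is a minimal A* sketch. It assumes a 4-connected occupancy grid given as strings with `#` marking blocked cells, unit step cost, and a Manhattan-distance heuristic; a road-vehicle planner would add kinematic constraints that this sketch ignores.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on `grid` as a list of (row, col), or None."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry superseded by a cheaper route
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


GRID = [
    "....",
    ".##.",
    "....",
]
print(astar(GRID, (0, 0), (2, 3)))
```

Returning `None` when no path exists gives the calling layer an explicit signal that can trigger replanning or a fallback behavior.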
Motion planning must satisfy several requirements at the same time:
Motion planning is therefore best understood as a constrained decision problem, not merely a geometric search problem.
In production systems, purely classical or purely AI-based solutions are uncommon. Most autonomy stacks are hybrid.
| Hybrid pattern | Typical structure | Why it is useful |
|---|---|---|
| AI high-level, classical low-level | AI selects maneuvers; classical controller tracks trajectory | preserves transparent low-level actuation |
| AI model, classical controller | Learned vehicle model inside a classical MPC framework | improves model accuracy while keeping optimization structure |
| Classical baseline, AI exception handling | Classical controller handles normal cases; AI assists in rare cases | keeps the core system predictable |
| Safety layer plus learned policy | Learned planner or controller is constrained by a safety monitor | useful for bounding behavior in difficult environments |
| Tuned hybrid controller | Neural network adjusts gains or parameters of a classical controller | combines adaptability with known control structure |
These architectures reflect the practical reality of autonomy engineering. The most safety-critical elements are usually kept as transparent and bounded as possible, while AI is introduced where it adds the most value and where validation methods are strong enough to support it.
The main lesson of this subsection is that planning, control, and decision-making must be treated together. Classical methods provide structure, transparency, and a strong safety tradition. AI-based methods provide flexibility and a way to address complexity that is difficult to model explicitly. Behavioral algorithms and motion planners convert system understanding into executable motion. Hybrid systems combine these strengths and are often the most realistic route to deployable autonomy.
The next step is to validate these methods not only in isolation, but as parts of a system. For that reason, the following subsection moves from methods to validation, scenario design, and testing.
Planning and control must be validated as a system function, not only as isolated algorithms. A planner may produce a technically correct trajectory, and a controller may track a reference path accurately, but the combined behavior can still be unsafe if perception is delayed, localization drifts, prediction is wrong, or the actuation path introduces latency. The validation view therefore focuses on the complete decision–execution loop and asks whether the autonomous vehicle behaves safely, predictably, and consistently inside its intended operating conditions.
The purpose of this subsection is to define how planning and control should be evaluated across different validation levels. The key point is that component tests are useful, but they are only one part of the evidence chain. The reader should be able to trace a planning or control function from its local behavior to its interaction with the full vehicle system, and then to the operating domain where the vehicle is actually expected to function.
<!-- Figure comment: Validation view from component verification to system and operational validation -->
The validation process can be understood as a progression from local correctness to system-level safety. This progression is especially important in the planning and control layer, because the behavior of the system depends on tightly coupled interactions between the decision layer, the planner, the controller, the vehicle model, and the environment.
| Validation level | What is checked | Example for planning and control | Typical evidence |
|---|---|---|---|
| Unit level | The component in isolation | Planner logic, controller law, fallback trigger, trajectory tracking rule | Component test results, assertion checks, interface tests |
| Integration level | Interaction between modules | Planner with localization, prediction, and control interface | Closed-loop integration logs, timing traces, message traces |
| System level | Full vehicle behavior in a closed loop | Lane change, overtaking, stopping, obstacle avoidance | Scenario results, system KPIs, safety metrics |
| Operational level | Behavior in the intended use context | Validated operation inside the ODD, track, or field pilot | Validation report, field data, pass/fail evidence |
This distinction matters because a good unit test does not guarantee safe system behavior. A motion planner can be correct as a software module and still create unsafe motion if it receives stale state information or if the controller cannot follow the generated path within the vehicle’s physical limits. Likewise, a controller can be stable in isolation but still produce an unsafe outcome if the trajectory itself is too aggressive or if the vehicle state changes faster than expected.
In the V-model perspective used throughout the handbook, planning and control occupy the portion where implementation evidence must be mapped back to the system requirements. The general V-model explanation is not repeated here; it is enough to note that planning and control are evaluated at several points along that validation chain: first as modules, then as integrated functions, and finally as part of the complete autonomous vehicle.
The validation question is not simply “does the algorithm work?” It is “does the vehicle behave safely and correctly when the algorithm is embedded in the full autonomy stack?” That means the evaluation must cover the following aspects together:
| Validation aspect | What it means in practice |
|---|---|
| Functional correctness | The maneuver or trajectory matches the intended behavior |
| Physical feasibility | The motion can be executed by the vehicle without violating dynamics |
| Safety | The vehicle avoids collisions and unsafe close approaches |
| Rule compliance | The motion respects traffic rules, road geometry, and operational constraints |
| Robustness | The behavior remains acceptable under uncertainty, delays, and disturbances |
| Comfort | The motion does not introduce excessive jerk, sharp braking, or unstable steering |
| Timeliness | The planner and controller act within the response time allowed by the scenario |
Planning and control are sensitive to the assumptions behind the system. A small change in localization accuracy, actuation delay, road friction, or prediction uncertainty may produce a very different trajectory outcome. For this reason, validation should not be framed as a single pass/fail test on one nominal case. It should be framed as a collection of evidence showing that the system remains acceptable across the planned range of operating conditions.
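The sensitivity described above can be illustrated with a minimal sketch. It assumes a simple point-mass braking model, which is not taken from the handbook: the vehicle travels at constant speed during the actuation delay and then brakes at the friction-limited deceleration. The function name and the numeric values are illustrative only.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance(speed_mps, delay_s, friction):
    """Distance covered during the delay plus friction-limited braking distance.

    Assumed point-mass model: d = v * t_delay + v^2 / (2 * mu * g).
    """
    if speed_mps < 0 or delay_s < 0 or friction <= 0:
        raise ValueError("speed and delay must be >= 0, friction must be > 0")
    return speed_mps * delay_s + speed_mps ** 2 / (2.0 * friction * G)


if __name__ == "__main__":
    speed = 50 / 3.6  # 50 km/h expressed in m/s
    # Illustrative (delay, friction) pairs: nominal, slow actuation, wet road, both.
    for delay_s, friction in [(0.2, 0.9), (0.5, 0.9), (0.2, 0.4), (0.5, 0.4)]:
        d = stopping_distance(speed, delay_s, friction)
        print(f"delay={delay_s:.1f} s, mu={friction:.1f} -> {d:.1f} m")
```

Even in this simplified model, the same maneuver needs a noticeably different distance once delay and friction move away from the nominal case, which is why a single nominal test is weak evidence.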
The planning and control layer is best validated through a chain of evidence. First, the team defines a maneuver or mission objective. Then the system assumptions and operating constraints are specified. Next, scenarios are generated to exercise the maneuver under controlled variation. After that, simulation, closed-loop execution, and physical confirmation are used to check the system response. Finally, the results are expressed in measurable metrics and tied back to the safety argument.
This logic is already visible in the current material, which treats digital twins as the basis for meaningful simulation, uses design-of-experiments to stress the decision and control logic, and combines local properties such as trajectory tracking with system-level effects such as minimum distance to collision. It also emphasizes that the simulator must remain predictive as the product evolves, so that post-deployment logs, updated vehicle parameters, and map changes can be folded back into continuous validation.
The important design principle is that planning and control validation should support both:
1. **local evidence**, where the behavior of a single planner or controller can be checked;
2. **system evidence**, where the combined behavior of the autonomy stack is evaluated in closed loop.
This is why scenario execution, digital twin fidelity, and timing realism matter. If the virtual environment is too abstract, the test may not reveal the same failure modes that appear in the real vehicle. If the virtual environment is too expensive or too detailed, the test program may not scale to a useful number of scenarios. Validation therefore needs a balance between breadth and realism.
For this chapter, the most useful evidence types are trajectory, timing, and safety evidence, complemented by robustness and operational evidence.
| Evidence type | Example output | Why it matters |
|---|---|---|
| Trajectory evidence | Path error, tracking error, lane deviation, path smoothness | Shows whether the plan can be executed as intended |
| Timing evidence | Planner latency, controller latency, response delay | Shows whether the system reacts quickly enough |
| Safety evidence | Collision result, time-to-collision (TTC), distance-to-collision (DTC), minimum distance | Shows whether the behavior remains safe |
| Robustness evidence | Performance under sensor delay, localization drift, or actuation variation | Shows whether the result survives uncertainty |
| Operational evidence | Performance inside the intended ODD | Shows whether the system is ready for realistic use |
These evidence types should be collected together, not separately. A vehicle that tracks a path accurately but violates safety margins is not acceptable. A vehicle that avoids collision but behaves erratically or unpredictably is also not acceptable. The validation view therefore requires a combined reading of the metrics rather than a single score.
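A combined reading of the metrics could be sketched as follows. The log format, the field names, and the thresholds are assumptions made for illustration; a real program would take them from its own requirements.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    t: float              # time in s
    gap_m: float          # distance to the nearest relevant actor in m
    closing_mps: float    # closing speed in m/s, positive when the gap shrinks
    cross_track_m: float  # signed lateral deviation from the planned path in m


def time_to_collision(gap_m, closing_mps):
    """TTC = gap / closing speed; infinite when the gap is not shrinking."""
    if closing_mps <= 0:
        return math.inf
    return gap_m / closing_mps


def summarize(log):
    """Reduce one run to a small set of safety and trajectory metrics."""
    if not log:
        raise ValueError("log must contain at least one sample")
    return {
        "min_gap_m": min(s.gap_m for s in log),
        "min_ttc_s": min(time_to_collision(s.gap_m, s.closing_mps) for s in log),
        "max_cross_track_m": max(abs(s.cross_track_m) for s in log),
    }


def acceptable(summary, min_gap_m=2.0, min_ttc_s=1.5, max_cross_track_m=0.3):
    """All criteria must hold together; the thresholds are hypothetical."""
    return (
        summary["min_gap_m"] >= min_gap_m
        and summary["min_ttc_s"] >= min_ttc_s
        and summary["max_cross_track_m"] <= max_cross_track_m
    )
```

Because `acceptable` is a conjunction, a run with excellent tracking but a violated gap is rejected, which mirrors the point that no single metric can stand in for the others.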
The planning and control layer is the point where autonomous behavior becomes visible in the physical world. A mistake here is not only a software error; it is an action error. That is why this subsection must be treated as a bridge between the planning algorithms described earlier and the scenario-based test methodology that follows. It prepares the reader to ask the right questions: what should be tested, at what level, under which conditions, and with what evidence.
The next subsection therefore moves from this validation view to the concrete generation of test scenarios, logical ranges, and executable cases.
Scenario design is the point where planning and control validation becomes concrete. A control or planning function cannot be evaluated only through abstract claims such as “the planner is safe” or “the controller is robust.” It must be tested in situations that represent the intended operational design domain, with parameters that can be varied systematically and outcomes that can be measured consistently. For this reason, scenario design is the bridge between system requirements and executable validation cases.
The purpose of this subsection is to show how a validation idea becomes a testable scenario. The basic progression is simple: a use case is described in natural language, that description is turned into a structured logical scenario, and then the logical scenario is instantiated into concrete test cases. This structure allows the same planning or control function to be tested across many variations of speed, distance, road geometry, actor behavior, visibility, and vehicle state.
<!-- Figure comment: scenario funnel from functional scenario to logical scenario to concrete test cases -->
A useful way to organize validation is to distinguish three levels of scenario abstraction.
| Scenario level | What it represents | Example |
|---|---|---|
| Functional scenario | A human-readable description of the situation | Ego vehicle overtakes a slower lead vehicle |
| Logical scenario | Parameter ranges and constraints | Ego speed 20–40 km/h, lead speed 5–20 km/h, initial gap 10–40 m |
| Concrete scenario | One executable test case with fixed values | Ego speed 30 km/h, lead speed 10 km/h, gap 20 m, dry daylight |
The functional scenario is the most accessible starting point. It tells the reader what type of driving situation is being examined, but it does not yet define a test. The logical scenario turns that situation into a parameterized family of cases. This is where validation becomes systematic, because the same scenario can be repeated with different values for speed, distance, weather, road shape, and other variables. The concrete scenario is the final executable instance. It is the single run that appears in simulation, on a test track, or in a monitored field trial.
This three-step structure is especially useful for planning and control because these functions are highly sensitive to context. A lane change that is safe at low speed and with a large gap may become unsafe when the lead vehicle is faster, when the adjacent lane contains a moving actor, or when the controller reacts too late. Scenario abstraction therefore helps the engineer separate the behavior that should remain stable from the factors that are deliberately varied.
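The step from a logical scenario to a concrete one can be sketched as a small data structure. The class and parameter names are hypothetical; the ranges are the ones from the overtaking example in the table above. Categorical conditions such as weather and lighting are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class LogicalScenario:
    """A parameterized scenario family: each parameter has an allowed range."""
    name: str
    ranges: dict  # parameter name -> (low, high), both inclusive

    def instantiate(self, **values):
        """Return one concrete scenario, rejecting unknown or out-of-range values."""
        missing = set(self.ranges) - set(values)
        extra = set(values) - set(self.ranges)
        if missing or extra:
            raise ValueError(f"missing={sorted(missing)}, unexpected={sorted(extra)}")
        for key, value in values.items():
            low, high = self.ranges[key]
            if not low <= value <= high:
                raise ValueError(f"{key}={value} is outside [{low}, {high}]")
        return {"scenario": self.name, **values}


# Logical scenario for "ego vehicle overtakes a slower lead vehicle".
overtake = LogicalScenario(
    name="overtake_slower_lead",
    ranges={
        "ego_speed_kmh": (20, 40),
        "lead_speed_kmh": (5, 20),
        "initial_gap_m": (10, 40),
    },
)

# One concrete, executable case with fixed values.
concrete_case = overtake.instantiate(ego_speed_kmh=30, lead_speed_kmh=10, initial_gap_m=20)
```

Keeping the range check inside `instantiate` is one possible design choice: it prevents a concrete case from silently drifting outside the logical scenario it claims to belong to.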
A good scenario is more than a scene description. It should define the elements that matter to the validation question.
| Scenario element | What should be specified |
|---|---|
| Ego vehicle state | Position, speed, heading, acceleration, planned maneuver |
| Other actors | Number, type, speed, intent, motion constraints |
| Road and infrastructure | Lane geometry, signs, signals, curbs, merges, intersections |
| Environment | Weather, light, visibility, road friction, occlusions |
| Timing | Initial time, trigger moment, reaction window, duration |
| Vehicle limits | Braking capability, turning radius, steering limits, comfort bounds |
| System assumptions | Localization quality, perception delay, prediction horizon, fallback behavior |
These elements define the test context and prevent ambiguity. If a scenario does not explicitly specify the initial state, the actor behavior, or the environmental constraints, then it is difficult to reproduce the test or interpret the result. The goal is not to overspecify every detail, but to identify the variables that change the planning and control response.
For planning and control, the most important scenario variables often include vehicle speed, initial separation, relative speed, road curvature, lane availability, traffic density, and timing delay. Those are the factors that tend to change whether a maneuver is safe, whether a controller can track the trajectory, and whether the system can recover when conditions become difficult.
Once the scenario family is defined, the next step is to select concrete cases that stress the planning or control logic in meaningful ways. This is where design of experiments becomes useful. Rather than testing one nominal case repeatedly, the engineer deliberately varies the scenario parameters so that the same maneuver is evaluated under different conditions. This reveals which factors have the strongest effect on safety and performance.
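For example, a lane-change scenario can be parameterized by the variables named above: vehicle speed, initial separation, relative speed, road curvature, lane availability, traffic density, and timing delay.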
By varying these parameters systematically, the validation team can identify boundary conditions. Some cases may be clearly safe, some clearly unsafe, and some close to the operational limit. The purpose of scenario design is to expose those limits early, before the system is accepted as safe for deployment.
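A systematic variation of this kind can be sketched as a full-factorial grid. This is only one possible design-of-experiments strategy (sampling-based designs are common when the grid becomes too large), and the parameter names and level counts are illustrative assumptions.

```python
import itertools


def linspace(low, high, count):
    """Return `count` evenly spaced values from low to high, both included."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def full_factorial(levels):
    """Build every combination of the given parameter levels.

    `levels` maps a parameter name to the list of values to test.
    """
    names = list(levels)
    combos = itertools.product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Hypothetical levels for an overtaking scenario family.
levels = {
    "ego_speed_kmh": linspace(20, 40, 3),
    "lead_speed_kmh": linspace(5, 20, 4),
    "initial_gap_m": linspace(10, 40, 4),
}
cases = full_factorial(levels)  # 3 * 4 * 4 concrete cases
```

The grid grows multiplicatively with each added parameter, which is the practical reason for narrowing the variables to those that actually change the planning and control response.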
This is also where scenario-based validation becomes more useful than mileage alone. Distance driven tells us how much the vehicle has moved. Scenario coverage tells us what kinds of situations the vehicle has actually faced. For planning and control, the second is much more informative than the first.
Different validation questions require different scenario families. The table below gives a practical organization.
| Scenario family | Typical question |
|---|---|
| Lane change and overtaking | Can the vehicle choose and execute a safe passing maneuver? |
| Cut-in and cut-out | Can the vehicle handle a nearby actor entering or leaving the lane? |
| Obstacle avoidance | Can the planner redirect the vehicle around a static or dynamic obstacle? |
| Stop and yield | Does the vehicle slow down or stop correctly at crosswalks, intersections, or merges? |
| Following behavior | Can the vehicle maintain safe headway and stable tracking behind another vehicle? |
| Emergency behavior | Does the system transition safely to a fallback state when the plan becomes unsafe? |
Each family should be translated into a logical scenario description and then into a set of concrete test cases. The same maneuver can be repeated across different speeds, actor behaviors, road geometries, and environmental conditions, which makes it possible to compare outcomes and identify trends rather than isolated events.
A scenario is only useful if its output can be interpreted consistently. For planning and control, the result should be expressed through clear labels and quantitative metrics.
| Outcome label | Meaning |
|---|---|
| Success | The maneuver was completed safely and within the expected constraints |
| Collision | The scenario resulted in an impact |
| Separation violation | The vehicle came too close to another actor or obstacle |
| Excessive deceleration | The vehicle behaved too aggressively or uncomfortably |
| Long pass without return | The vehicle completed the maneuver but failed to return to the nominal path or lane behavior |
| Timeout | The system failed to complete the maneuver in the allotted time |
These labels are useful because they tie the scenario directly to system behavior. They help distinguish between a planner that is merely inefficient, a controller that is merely slow, and a system that is actually unsafe. The purpose of scenario design is not only to generate runs, but to make the runs interpretable.
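The labels in the table can be assigned automatically from run metrics. The sketch below is an assumption-laden illustration: the metric fields, the thresholds, and in particular the priority order (most severe label first) are not defined in the handbook.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    collided: bool           # an impact occurred
    min_gap_m: float         # smallest distance to any actor or obstacle in m
    peak_decel_mps2: float   # largest braking deceleration in m/s^2
    completed: bool          # maneuver finished within the allotted time
    returned_to_lane: bool   # vehicle returned to the nominal lane or path


def label_outcome(m, min_gap_limit_m=2.0, decel_limit_mps2=4.0):
    """Map one run to a single label, checking the most severe outcome first."""
    if m.collided:
        return "Collision"
    if m.min_gap_m < min_gap_limit_m:
        return "Separation violation"
    if not m.completed:
        return "Timeout"
    if not m.returned_to_lane:
        return "Long pass without return"
    if m.peak_decel_mps2 > decel_limit_mps2:
        return "Excessive deceleration"
    return "Success"
```

Returning exactly one label per run keeps results easy to tabulate; a program that needs to record several simultaneous violations could return a list instead.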
Scenario design is the foundation for the later validation methods. Simulation, software-in-the-loop, hardware-in-the-loop, formal falsification, test-track execution, and field trials all depend on having scenarios that are well-defined and reproducible. If the scenario itself is vague, then the test evidence becomes weak, even if the simulator or test track is highly realistic.
For that reason, this subsection should be read as the preparatory stage for the test-method section that follows. It defines what should be tested, how the test family should be structured, and which parameters should be varied. The next subsection can then explain how those scenarios are executed through simulation, formal methods, and physical testing.
Scenario design tells us what should be tested. Test methods determine how those scenarios are executed, what kind of evidence is collected, and how the results are translated into a validation argument. For planning and control, this distinction matters because the same concrete scenario can be exercised in several different ways: first in simulation, then in software-in-the-loop or hardware-in-the-loop settings, then on a controlled test track, and finally in a monitored real-world environment. The book already follows this logic in its current material, where physical testing, real-world seeding, and virtual testing are treated as three complementary ways of generating and executing tests, each with different strengths and limitations.
The central idea is that the scenario from the previous subsection must be brought into a test environment that is suitable for the question being asked. A lane-change scenario, for example, may first be explored in CARLA or another simulator, then repeated with software-in-the-loop or hardware-in-the-loop components, then confirmed at a controlled proving ground such as ZalaZONE, and finally monitored in limited real-world operation. In each case, the test method changes, but the underlying scenario remains the same. That is what makes the validation evidence comparable.
Simulation is the first and most flexible execution method. It allows the team to test a large number of scenario variants quickly, safely, and repeatably. This is especially important for planning and control, because the behavior of these modules depends strongly on speed, spacing, road geometry, actor behavior, and timing. A simulator can sweep those parameters systematically and expose boundary cases that would be too risky or too expensive to reproduce physically.
The simulation toolbox described elsewhere in this handbook applies directly here. For ground systems, CARLA is a natural open-source choice for academic and research work because it supports realistic urban scenes and sensor stacks. NVIDIA DRIVE Sim is useful when the goal is GPU-accelerated synthetic data and digital-twin style validation. IPG CarMaker, dSPACE ASM, VIRES VTD, Applied Intuition, Cognata, and MathWorks tools can be used when the focus shifts toward closed-loop vehicle dynamics, scenario coverage, or industrial validation workflows. These platforms are not identical, and that is part of the point: some are better for scenario breadth, some for sensor realism, some for controller validation, and some for integration with SiL and HiL.
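For planning and control, simulation is especially useful when the test objective is to sweep scenario parameters broadly or to reach boundary cases that cannot be reproduced safely or affordably on a physical vehicle.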
A practical way to use simulation is to split it into two layers. Low-fidelity simulation is used first to sweep large scenario spaces quickly and identify where safety margins begin to tighten. High-fidelity simulation is then used for the most important cases, where sensor realism, closed-loop dynamics, and timing behavior matter more. The book’s current material already describes this logic: low-fidelity simulation is useful for broad exploration, while high-fidelity simulation is used to replay informative cases with more realism and to connect the result to later track testing.
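The two-layer approach can be sketched as a cheap screening step that decides which cases deserve a high-fidelity replay. The low-fidelity model here is an assumed constant-speed, constant-deceleration approach to a slower lead vehicle; the function names, default values, and the margin band are hypothetical.

```python
def min_gap_low_fidelity(initial_gap_m, ego_mps, lead_mps, delay_s=0.5, decel_mps2=3.0):
    """Predict the minimum gap when the ego brakes to match a slower lead vehicle.

    Assumed model: the lead keeps constant speed, the ego reacts after `delay_s`
    and then decelerates at `decel_mps2` until the closing speed reaches zero.
    """
    closing = ego_mps - lead_mps
    if closing <= 0:
        return initial_gap_m  # the gap never shrinks
    return initial_gap_m - closing * delay_s - closing ** 2 / (2.0 * decel_mps2)


def select_for_high_fidelity(cases, margin_band_m=5.0):
    """Keep cases whose predicted minimum gap is below the band, tightest first."""
    scored = [(min_gap_low_fidelity(**case), case) for case in cases]
    tight = [item for item in scored if item[0] < margin_band_m]
    return [case for _, case in sorted(tight, key=lambda item: item[0])]


if __name__ == "__main__":
    sweep = [
        {"initial_gap_m": 40.0, "ego_mps": 10.0, "lead_mps": 5.0},
        {"initial_gap_m": 10.0, "ego_mps": 10.0, "lead_mps": 2.0},
        {"initial_gap_m": 12.0, "ego_mps": 8.0, "lead_mps": 4.0},
    ]
    for case in select_for_high_fidelity(sweep):
        print(case)
```

A negative predicted gap means the simplified model expects contact; such cases are exactly the ones worth replaying with realistic sensing, dynamics, and timing before drawing conclusions.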
Simulation becomes more valuable when the actual autonomy stack is connected to it. In software-in-the-loop testing, the real planning or control software runs inside the virtual environment. This is useful because it tests the actual code while keeping the physical risk low. If the software produces the wrong maneuver, the wrong timing, or the wrong fallback action, the error can be observed in a safe and repeatable setting.
Hardware-in-the-loop adds another layer of realism. It places real or representative hardware into the loop, such as ECUs, data buses, actuator interfaces, or timing elements. This is particularly important for planning and control, because the question is often not only whether the algorithm is correct, but whether the command reaches the vehicle correctly and on time. A planner that works in software may still fail once the actuation path, timing jitter, or bus communication is introduced.
The handbook's discussion of virtual ECUs and data buses gives a good example of this: the test rig can simulate bus traffic, counters, checksums, subsystem failures, and graceful degradation. Such HiL-style twins help validate actuator-path integrity without requiring a full physical rig.
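A counter-and-checksum check of the kind a test rig might apply to command frames can be sketched as follows. The frame layout (a 4-bit rolling counter and an 8-bit additive checksum) is a made-up illustration and does not correspond to any specific bus protocol or end-to-end protection profile.

```python
COUNTER_MODULO = 16  # assumed 4-bit rolling counter


def checksum(payload, counter):
    """Hypothetical 8-bit additive checksum over the payload bytes and the counter."""
    return (sum(payload) + counter) % 256


def make_frame(payload, counter):
    """Build a frame dict with a wrapped counter and a matching checksum."""
    wrapped = counter % COUNTER_MODULO
    return {"payload": payload, "counter": wrapped, "checksum": checksum(payload, wrapped)}


def check_stream(frames):
    """Return (index, reason) for every frame with a bad checksum or counter jump."""
    faults = []
    expected = None
    for index, frame in enumerate(frames):
        if checksum(frame["payload"], frame["counter"]) != frame["checksum"]:
            faults.append((index, "checksum"))
        elif expected is not None and frame["counter"] != expected:
            faults.append((index, "counter"))
        # Resynchronize on the received counter so one lost frame is reported once.
        expected = (frame["counter"] + 1) % COUNTER_MODULO
    return faults
```

In a fault-injection run, the rig would deliberately drop or corrupt frames and confirm that the receiving function detects the fault and degrades gracefully rather than acting on a stale command.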
Test tracks are the bridge between simulation and real-world operation. They provide physical realism while preserving a controlled environment in which scenarios can be repeated, instrumented, and compared. This makes them ideal for confirming whether a scenario that worked in simulation also behaves correctly on a vehicle with real dynamics, real sensing, and real timing.
One example of a ground-systems test track is ZalaZONE in Hungary. ZalaZONE includes a Smart City Zone, highway and rural sections, a high-speed oval, a dynamic platform, wet and dry handling courses, off-road areas, and V2X/5G infrastructure. It also supports simulation and digital-twin integration through tools such as IPG CarMaker and AVL, making it especially useful for SiL and HiL validation alongside physical track tests.
<!-- Figure comment: ZalaZONE autonomous test track -->
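Test-track validation is particularly suitable for confirming simulation results on a real vehicle and for repeating a maneuver under instrumented, controlled conditions.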
The strength of a test track is controllability. The same maneuver can be repeated under carefully defined conditions, and the result can be compared against the corresponding simulation case. This makes it possible to isolate whether an unsafe outcome came from the scenario itself, the planner, the controller, the localization path, or the actuation behavior.
Sensor and EMC test infrastructure supports the broader idea of physical validation. Anechoic chambers (fully anechoic and semi-anechoic), RF-shielded rooms, and reverberation chambers are important when sensor behavior, electromagnetic interference, and communication robustness need to be measured under controlled conditions. This matters here because planning and control depend on the quality and timing of the sensing stack, and sensing validation is part of what makes the test result credible.
<!-- Figure comment: anechoic chamber -->
Real-world testing is the most demanding method because it captures the actual operational environment. It should therefore be used after the system has already shown acceptable behavior in simulation and on the track. The goal is not to replace simulation or track testing, but to confirm that the validated behavior survives contact with the real world.
Two lines of real-world validation should be distinguished: one uses real-world experience as the starting point for further virtual testing, while the other uses the fleet or field itself as a large distributed testbed. Pegasus and the Warwick-related scenario database are good examples of the first, where observed situations are turned into reusable validation material. The Tesla-style fleet approach is a good example of the second, where data from the field is fed into a large-scale validation pipeline. OpenSCENARIO 2.0 also belongs here because it supports symbolic, reproducible scenario generation based on structured descriptions rather than ad hoc test notes.
Test generation can also be seeded by observed events. Real-world seeding is valuable because it gives the team real situations instead of purely synthetic ones. However, completeness is still an open issue, and there is always a risk that the collected database overrepresents familiar or already-seen conditions. That is why seeding should be treated as a source of test diversity, not as a complete validation solution.
Real-world testing is most useful when the question is whether the system remains safe and predictable in its intended operating environment.
The test method should follow the validation question.
| Validation question | Best method to start with |
|---|---|
| Can the planner produce a safe trajectory across many parameter combinations? | Simulation |
| Does the real software behave correctly in the virtual world? | Software-in-the-loop |
| Does timing, communication, and actuator integration work correctly? | Hardware-in-the-loop |
| Does the system behave correctly under controlled physical conditions? | Test track |
| Does the system remain safe in the intended operating environment? | Real-world testing |
This is not a rigid ladder. In practice, validation moves back and forth between methods. A track failure may lead to changes in the scenario model or the simulator. A simulation failure may lead to a revised controller or a narrower ODD. A real-world failure may lead to a new safety margin, a changed fallback rule, or a better test-track reproduction of the same case.
The important point is that each method contributes a different kind of evidence. Simulation gives scale, SiL and HiL give integration realism, test tracks give controlled physical confirmation, and real-world testing gives operational credibility. For planning and control, a credible validation strategy needs all of them, with the scenario from the previous subsection serving as the common reference across the different execution environments.
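The results of planning and control testing should be recorded in a form that can be compared across methods and reused in the safety argument. The most useful evidence includes trajectory and tracking error, time-to-collision and distance-to-collision, collision or near-miss outcomes, maneuver completion time, planner latency, controller response delay, comfort measures, and fallback behavior.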
These outputs should be interpreted together. A maneuver that is accurate but unsafe is not acceptable. A maneuver that is safe but erratic may also be unacceptable if it creates instability or poor comfort. The validation report should therefore link each result back to the scenario definition, the test method, and the original system requirement.
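The traceability described above can be sketched as a simple evidence record plus a gap check. The record fields, the identifiers, and the set of required methods are hypothetical and would follow the project's own requirement and scenario naming in practice.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EvidenceRecord:
    requirement_id: str  # system requirement the result traces back to
    scenario_id: str     # concrete scenario that was executed
    method: str          # e.g. "simulation", "sil", "hil", "test_track", "real_world"
    outcome: str         # outcome label such as "Success" or "Collision"
    metrics: dict = field(default_factory=dict)


def methods_per_requirement(records):
    """Collect which test methods have produced evidence for each requirement."""
    coverage = defaultdict(set)
    for record in records:
        coverage[record.requirement_id].add(record.method)
    return dict(coverage)


def missing_methods(records, required=("simulation", "test_track")):
    """List, per requirement, the required methods that have no evidence yet."""
    gaps = {}
    for requirement, methods in methods_per_requirement(records).items():
        absent = sorted(set(required) - methods)
        if absent:
            gaps[requirement] = absent
    return gaps
```

A check like `missing_methods` makes it visible when a requirement is supported by simulation alone, so the validation report does not overstate the strength of its evidence.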
The role of this subsection is to turn scenarios into evidence. Simulation, track testing, and real-world testing are not competing methods; they are complementary layers of the same validation strategy. Simulation gives breadth, physical test tracks give controlled confirmation, and real-world operation gives the strongest form of deployment evidence. The next subsection can now focus on how these results are packaged into a validation argument and how they support the chapter summary.
This chapter has shown that planning, control, and decision-making are not isolated algorithmic topics, but a connected autonomy function that turns perception into motion and motion into system behavior. The chapter began by locating planning and control inside the autonomy stack, then explained how classical and AI-based control strategies, behavioral logic, and motion planning work together to produce an executable action. It then moved from method descriptions to a validation view, showing that the correct question is not only whether a planner or controller works in isolation, but whether the complete decision–execution loop behaves safely, predictably, and consistently inside the intended operational design domain.
A central message of the chapter is that validation must be layered. Planning and control should first be tested at component level, then at integration level, then at system level, and finally in the operational context where the vehicle is expected to function. A planner can look correct in a unit test and still fail when linked to localization drift, prediction uncertainty, or controller latency. A controller can track a clean reference trajectory and still produce unsafe or uncomfortable behavior if the planned motion is too aggressive or the actuation path is too slow. For that reason, this chapter emphasized the need to connect local correctness to system-level safety evidence. The V-model is relevant here, but only as a reference point for where planning and control sit in the broader assurance chain; it is not something that needs to be re-explained in detail in this subsection.
The chapter also made the case that scenario design is the bridge between requirements and tests. Validation becomes meaningful when functional scenarios are transformed into logical parameterized families and then into concrete test cases. That progression makes it possible to vary speed, separation, road geometry, actor behavior, timing, and environmental conditions in a disciplined way. In planning and control, those variations matter because the same maneuver can be safe in one context and unsafe in another. Scenario-based testing therefore provides a better foundation for validation than mileage alone, because it captures *which* situations were tested, not only *how far* the vehicle traveled.
The testing-method section then showed how those scenarios can be executed through simulation, software-in-the-loop, hardware-in-the-loop, test-track, and real-world testing. Simulation provides breadth, repeatability, and safe access to edge cases. SiL and HiL connect the scenario to the actual software and hardware execution path. Test-track validation confirms that the planned behavior survives physical dynamics, sensor timing, and controlled real-world interaction. Real-world testing provides the strongest operational evidence, but only after the system has already shown acceptable performance in lower-risk environments. The overall lesson is that no single method is sufficient by itself. A credible validation program uses all of them in sequence and interprets them as complementary evidence.
A further point emphasized in this chapter is that planning and control validation should produce evidence, not just test results. The value of the work lies in the traceable package of outputs: trajectory error, tracking error, Time-to-Collision, Distance-to-Collision, collision or near-miss outcomes, maneuver completion time, planner latency, controller response delay, comfort measures, and fallback behavior. These metrics should be interpreted together because safety, feasibility, timing, and comfort are all part of acceptable autonomous behavior. A vehicle that is accurate but unsafe is not acceptable; a vehicle that is safe but erratic is also not acceptable. The final validation argument must therefore be based on a coherent set of evidence rather than a single pass/fail label.
The chapter’s practical aim is to give the reader a repeatable method. Start from the function under test. Define the scenario family. Choose the right execution method. Measure the outcome against explicit criteria. Package the results as evidence. That method is the chapter’s real output, and it is the basis for the next chapters in the handbook.
The result is a chapter that moves from control and planning theory to a practical validation workflow for autonomous vehicles. That shift is what makes the chapter useful for productization, safety assurance, and later application in the exercise book and partner chapters.
Human–machine communication (HMC) is a critical safety and effectiveness layer across ground, aerospace, marine, and space systems, shaping how humans supervise, trust, and intervene in increasingly autonomous platforms. In aerospace, communication is highly structured and procedural, integrating pilots with automation through cockpit interfaces, alerts, and air traffic control, where clarity, workload management, and avoidance of mode confusion are paramount for safety. Marine systems emphasize long-duration situational awareness and often operate with reduced connectivity, requiring HMC that supports remote supervision, autonomy oversight, and coordination with human crews under uncertain environmental conditions. In space systems, communication is constrained by latency, limited bandwidth, and mission-critical stakes, driving the need for highly autonomous systems paired with carefully designed interfaces that allow operators to understand system state, diagnose anomalies, and issue high-level commands with confidence. However, ground systems face the greatest challenges in HMI.
Chapter two introduced safety and legal liability, built on the key concept of expectation functions: what behavior is expected of the autonomous ground vehicle given the totality of the facts. Intimately connected to this concept is any communication between the autonomous vehicle and surrounding humans. This chapter focuses on how ground autonomous vehicles interact and communicate with people and their surrounding environment. As automation removes the human driver from the control loop, new forms of Human–Machine Communication (HMC) are required to ensure transparency, trust, and safety. The chapter examines how information is exchanged between vehicles, passengers, pedestrians, operators, and fleet managers through a variety of interfaces and communication modes. It introduces conceptual and practical frameworks such as Human–Machine Interfaces (HMI), the Language of Driving (LoD), and public acceptance mechanisms that together define how autonomy becomes understandable and socially integrated in everyday mobility.
This chapter explores the specifics of Human–Machine Interaction (HMI) in autonomous vehicles (AVs) and how it differs fundamentally from traditional car dashboards. With the human driver no longer actively operating the vehicle, a new question arises: how should AI-driven systems communicate effectively with passengers, pedestrians, and other road users?
HMI in AVs extends far beyond the driver’s dashboard. It defines the communication bridge between machines, people, and infrastructure — shaping how autonomy is perceived and trusted. Effective HMI determines whether automation is experienced as intelligent and reliable or opaque and alien.
Traditional driver interfaces were designed to support manual control. In contrast, autonomous vehicles must communicate intent, status, and safety both inside and outside the vehicle. The absence of human drivers requires new communication models to ensure safe interaction among all participants.
This section addresses the available communication channels and discusses how these channels must be redefined to accommodate the new paradigm. Additionally, it considers how various environmental factors—including cultural, geographical, seasonal, and spatial elements—impact communication strategies.
A key concept in this transformation is the Language of Driving (LoD) — a framework for structuring and standardizing how autonomous vehicles express awareness and intent toward humans (Kalda et al., 2022).
Understanding how humans perceive the world is crucial for autonomous vehicles to communicate effectively. Human perception is multimodal — combining sight, sound, motion cues, and social awareness. By studying these perceptual mechanisms, AV designers can emulate intuitive human signals such as:
Such behaviorally inspired signaling helps AVs become socially legible, supporting shared understanding on the road.
Driving is a social act. Culture, norms, and environment shape how humans interpret signals and movements. Autonomous vehicles may need to adapt their communication style — from light colors and icons to audio tones and message phrasing — depending on cultural and regional expectations.
Research explores whether AVs could adopt human-like communication methods, such as digital facial expressions or humanoid gestures, to support more natural interactions in complex social driving contexts.
Modern HMI systems increasingly rely on artificial intelligence, including large language models (LLMs), to process complex situational data and adapt communication in real time. AI enables:
The evolution toward AI-mediated interfaces marks a shift from fixed UI design toward conversational and contextual vehicle communication.
While the previous section described the foundations and goals of HMI, this section focuses on how autonomous vehicles communicate with various stakeholders and through which modes. These interactions can be categorized by user type, purpose, and proximity.
The vehicle–passenger interface supports comfort, awareness, and accessibility. It replaces the human driver’s social role by providing:
Passenger communication must balance automation with reassurance. In an Estonian field study (Kalda, Sell & Soe, 2021), over 90% of first-time AV users reported feeling safe and willing to ride again when the interface clearly explained the vehicle’s actions.
The vehicle–pedestrian interface (V2P) substitutes human cues such as eye contact or gestures. The *Language of Driving* (Kalda et al., 2022) proposes using standardized visual symbols, light bars, or projections to express intent:
Pedestrian communication must remain universal and intuitive, avoiding dependence on text or language comprehension.
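One way to keep external signals consistent is to treat them as a fixed lookup from vehicle intent to cue, so that the same intent always produces the same language-free signal. The following is a minimal sketch under stated assumptions: the intent names, colors, and patterns are hypothetical placeholders chosen for illustration, not the symbol set proposed in the Language of Driving work.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """Vehicle intents to signal; the names are illustrative assumptions."""
    YIELDING = "yielding"
    STARTING = "starting"
    STOPPED = "stopped"


@dataclass(frozen=True)
class Cue:
    color: str           # light-bar color (hypothetical value)
    pattern: str         # light motion pattern (hypothetical value)
    needs_reading: bool  # True if the cue relies on text or language


# Hypothetical cue table: one fixed, text-free cue per intent.
CUES = {
    Intent.YIELDING: Cue("cyan", "slow_pulse", False),
    Intent.STARTING: Cue("amber", "fast_blink", False),
    Intent.STOPPED: Cue("white", "steady", False),
}


def cue_for(intent: Intent) -> Cue:
    """Return the cue for an intent, rejecting language-dependent cues."""
    cue = CUES[intent]
    if cue.needs_reading:
        raise ValueError(f"cue for {intent.name} depends on text")
    return cue


if __name__ == "__main__":
    for intent in Intent:
        print(intent.value, "->", cue_for(intent))
```

Keeping the table explicit makes it easy to review every cue for universality and to check that no intent is left without a signal.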
At current autonomy levels (L3–L4), a safety operator interface remains essential. Two variants exist:
Teleoperation acts as a *bridge* between human oversight and full autonomy — essential for handling ambiguous traffic or emergency scenarios.
A dedicated maintenance interface enables technicians to safely inspect and update the vehicle:
Such interfaces ensure traceability, reliability, and compliance with safety regulations.
Fleet-level interfaces provide centralized control and analytics for multiple vehicles. They support:
These tools operate mainly over remote communication channels, relying on secure data infrastructure.
Autonomous vehicle interaction can be divided into direct (local) and remote (supervisory) communication:
| Type | Example | Key Features |
|---|---|---|
| Direct (Local) | Passenger, pedestrian, or on-site operator | Low latency, physical proximity, immediate feedback. |
| Remote (Supervisory) | Teleoperation or fleet control | Network-based, high security, possible latency. |
| Service-Level (Asynchronous) | Maintenance, updates, diagnostics | Back-end communication; focuses on reliability and traceability. |
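The categories in the table above can be expressed as a small classification rule. The sketch below is only an illustration: reducing each interface to two boolean attributes (`on_site` and `real_time`) is an assumption made here for brevity, and the interface names are examples taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelType(Enum):
    DIRECT = "Direct (Local)"
    REMOTE = "Remote (Supervisory)"
    SERVICE = "Service-Level (Asynchronous)"


@dataclass(frozen=True)
class Interface:
    name: str
    on_site: bool    # the human is physically at or near the vehicle
    real_time: bool  # the interaction needs immediate feedback


def classify(interface: Interface) -> ChannelType:
    """Map an interface to a communication type (simplified rule)."""
    if interface.real_time and interface.on_site:
        return ChannelType.DIRECT
    if interface.real_time:
        return ChannelType.REMOTE
    # Anything that tolerates delay is treated as back-end communication.
    return ChannelType.SERVICE


if __name__ == "__main__":
    examples = [
        Interface("pedestrian light bar", on_site=True, real_time=True),
        Interface("teleoperation console", on_site=False, real_time=True),
        Interface("maintenance diagnostics", on_site=True, real_time=False),
    ]
    for item in examples:
        print(item.name, "->", classify(item).value)
```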
To ensure that human–machine communication is intuitive and safe, several universal design principles apply:
When applied systematically, these principles make autonomous systems understandable, predictable, and trustworthy.
References
Kalda, K.; Pizzagalli, S.-L.; Soe, R.-M.; Sell, R.; Bellone, M. (2022). *Language of Driving for Autonomous Vehicles.* Applied Sciences, 12(11), 5406. [https://doi.org/10.3390/app12115406](https://doi.org/10.3390/app12115406)
Kalda, K.; Sell, R.; Soe, R.-M. (2021). *Use Case of Autonomous Vehicle Shuttle and Passenger Acceptance.* Proceedings of the Estonian Academy of Sciences, 70(4), 429–435. [https://doi.org/10.3176/proc.2021.4.09](https://doi.org/10.3176/proc.2021.4.09)
The Language of Driving (LoD) describes the implicit and explicit signals that allow autonomous vehicles and humans to understand each other in mixed traffic [1–3].
Driving behavior can be analyzed as a layered communication system:
An autonomous vehicle must infer human intent and simultaneously display legible intent of its own [2].
Driving “languages” vary globally; hence interfaces must maintain universal meaning while allowing local adaptation [1]. Behavior should be recognizable but not anthropomorphic, preserving clarity across cultures [3].
Field experiments using light-based cues have shown that simple color and motion patterns effectively communicate awareness and yielding. Participants reported improved understanding when signals were consistent and redundant across modalities [2].
Formalizing LoD as a measurable framework is essential for verification, standardization, and interoperability of automated behavior [3].
References: [1] Razdan, R. et al. (2020). *Unsettled Topics Concerning Human and Autonomous Vehicle Interaction.* SAE EDGE Research Report EPR2020025.
[2] Kalda, K., Sell, R., Soe, R.-M. (2021). *Use Case of Autonomous Vehicle Shuttle and Passenger Acceptance.* Proc. Estonian Academy of Sciences, 70 (4).
[3] Kalda, K., Pizzagalli, S.-L., Soe, R.-M., Sell, R., Bellone, M. (2022). *Language of Driving for Autonomous Vehicles.* Applied Sciences, 12 (11).
The integration of autonomous vehicles (AVs) into everyday traffic introduces both technological and societal challenges. While automated driving systems aim to eliminate human error and improve efficiency, the perceived safety and acceptance of these systems remain crucial for their widespread adoption. Ensuring that people *trust* the technology is as important as ensuring that the technology *functions safely*.
Safety in autonomous mobility can be divided into two interdependent aspects:
Even if an AV operates flawlessly according to standards and regulations, users may still hesitate to use it unless the system communicates its actions clearly and behaves predictably. Thus, *trust* emerges as a measurable component of safety.
Public acceptance is closely linked to how transparently the system communicates its intentions and limitations. People expect autonomous vehicles to behave in a consistent and understandable manner — signalling when yielding, stopping, or resuming motion. Clear visual or auditory cues from the vehicle’s human–machine interface (HMI) can substantially increase user confidence.
Equally important is transparent communication from operators and authorities regarding how safety is managed, what happens in case of system failures, and how data is used. Misinformation or uncertainty during incidents may quickly erode public trust even if no technical fault has occurred.
Empirical research has shown that direct experience with AVs strongly increases trust. In one Estonian field study (Kalda, Sell & Soe, 2021), the majority of first-time users reported a high sense of safety and comfort, with over 90% indicating willingness to use autonomous shuttles again after their initial ride.
Such results confirm that personal experience and well-managed demonstrations are key factors in shaping public perception. People who interact directly with autonomous vehicles tend to transition from curiosity to trust, whereas those without exposure often remain cautious or skeptical. This highlights the importance of continuous testing, education, and public engagement.
Public acceptance extends beyond safety alone. It also encompasses questions of responsibility, fairness, accessibility, and societal impact. Autonomous transport must be inclusive and understandable to all citizens — regardless of age, digital literacy, or physical ability.
Ethical transparency, clear rules of accountability, and human-centered interface design all contribute to societal readiness for automation. Collaboration between engineers, psychologists, communication experts, and policy-makers is therefore essential to define a holistic framework of *social safety*.
Ensuring public confidence in autonomous mobility requires a balanced approach:
When these dimensions align, public acceptance evolves naturally, transforming initial curiosity and caution into trust and habitual use. The success of future autonomous mobility therefore depends not only on technological excellence but also on how well society understands and embraces it.
Reference: Kalda, K.; Sell, R.; Soe, R.-M. (2021). *Use Case of Autonomous Vehicle Shuttle and Passenger Acceptance.* Proceedings of the Estonian Academy of Sciences, 70(4), 429–435. [https://doi.org/10.3176/proc.2021.4.09](https://doi.org/10.3176/proc.2021.4.09)
Verification and Validation (V&V) of Human–Machine Interfaces (HMI) in autonomous vehicles ensure that communication between humans and intelligent systems is safe, intuitive, and consistent. While functional safety standards focus on the correct operation of sensors and control logic, HMI validation extends this to human comprehension, usability, and behavioral response [1–3].
The goal of HMI V&V is to confirm that:
The validation process therefore combines *technical testing* with *human-centered evaluation*.
Verification addresses whether the interface behaves as intended. Typical methods include:
Verification ensures consistency, latency limits, and redundancy across modalities before any user testing is performed.
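As an illustration, the latency and redundancy checks can be automated over logged signal events. This is a sketch under stated assumptions: the `SignalEvent` structure, the 200 ms latency limit, and the two-modality minimum are hypothetical values chosen for the example, not requirements taken from a standard or from the cited studies.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_LATENCY_MS = 200  # hypothetical latency limit
MIN_MODALITIES = 2    # hypothetical redundancy requirement


@dataclass(frozen=True)
class SignalEvent:
    intent: str                  # intent being signaled, e.g. "yield"
    decided_at_ms: int           # time the vehicle committed to the intent
    shown_at_ms: Dict[str, int]  # modality -> time the cue appeared


def verify(event: SignalEvent) -> List[str]:
    """Return a list of problems found; an empty list means the event passed."""
    problems = []
    count = len(event.shown_at_ms)
    if count < MIN_MODALITIES:
        problems.append(f"{event.intent}: only {count} modality used")
    for modality, shown in sorted(event.shown_at_ms.items()):
        latency = shown - event.decided_at_ms
        if latency > MAX_LATENCY_MS:
            problems.append(f"{event.intent}: {modality} latency {latency} ms")
    return problems


if __name__ == "__main__":
    event = SignalEvent("yield", 1000, {"light": 1080, "sound": 1150})
    print(verify(event))
```

Running such checks on every logged event before user studies begin keeps human-participant time focused on comprehension rather than on technical defects.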
Validation focuses on how people actually experience and understand the interface. This involves iterative testing with human participants in controlled and real-world environments [1–3]. Approaches include:
Results are analyzed to refine signal patterns, color codes, and message phrasing to improve intuitiveness and reduce confusion.
High-fidelity simulation environments enable early-stage evaluation of HMI without physical prototypes. Tools integrate virtual pedestrians, lighting, and weather to test how design choices influence visibility and legibility [3]. Virtual validation supports:
These techniques shorten development cycles and allow data-driven interface improvement.
To make validation reproducible, quantitative metrics are defined, such as:
Standardized metrics enable benchmarking across projects and support regulatory assessment of AV communication readiness.
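To show how such metrics can be computed reproducibly, the sketch below derives two example measures from study data. Both measures (comprehension rate and median reaction time) and the data layout are assumptions made for illustration and may differ from the metrics used in a given project.

```python
from statistics import median
from typing import List, Tuple


def comprehension_rate(trials: List[Tuple[str, str]]) -> float:
    """Share of trials where the reported meaning matched the intended one.

    Each trial is an (intended_meaning, reported_meaning) pair.
    """
    if not trials:
        raise ValueError("no trials recorded")
    correct = sum(1 for intended, reported in trials if intended == reported)
    return correct / len(trials)


def median_reaction_time(times_s: List[float]) -> float:
    """Median time in seconds from cue onset to participant response."""
    return median(times_s)


if __name__ == "__main__":
    trials = [("yield", "yield"), ("yield", "go"), ("stop", "stop"), ("go", "go")]
    print("comprehension rate:", comprehension_rate(trials))
    print("median reaction time (s):", median_reaction_time([1.2, 0.9, 1.6]))
```

The median is used here instead of the mean because reaction-time data often contain a few very slow responses that would otherwise dominate the result.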
HMI validation does not end with prototype testing. Field data from pilot deployments provide valuable feedback loops for ongoing improvement [2]. By combining simulation, real-world performance, and user analytics, HMI systems evolve continuously as technology and user expectations mature.
Effective verification and validation bridge the gap between technical functionality and human understanding. By ensuring that communication is accurate, interpretable, and trusted, these processes contribute directly to the safe and responsible deployment of autonomous mobility [1–3].
References: [1] Razdan, R. et al. (2020). *Unsettled Topics Concerning Human and Autonomous Vehicle Interaction.* SAE EDGE Research Report EPR2020025.
[2] Kalda, K., Sell, R., Soe, R.-M. (2021). *Use Case of Autonomous Vehicle Shuttle and Passenger Acceptance.* Proc. Estonian Academy of Sciences, 70 (4).
[3] Kalda, K., Pizzagalli, S.-L., Soe, R.-M., Sell, R., Bellone, M. (2022). *Language of Driving for Autonomous Vehicles.* Applied Sciences, 12 (11).
| Domain | Primary Standards Body | Key Autonomy Standard |
|---|---|---|
| Ground | SAE | SAE J3016 |
| Ground | ISO | ISO 26262, ISO 21448 |
| Ground | UNECE | UN R157 |
| Airborne | RTCA | DO-178C, DO-365 |
| Airborne | FAA/EASA | UAV autonomy certification |
| Marine | IMO | MASS autonomy levels |
| Marine | DNV | Autonomous ship standards |
| Space | NASA | ALFUS autonomy framework |
| Space | CCSDS | Spacecraft autonomy protocols |
| Cross-domain | IEEE | IEEE 7000 series |
| Cross-domain | IEC | IEC 61508 |
| Cross-domain | NIST | AI Risk Management Framework |
Industries and Companies:
| Type | Description | Example Players (Companies / Organizations) |
|---|---|---|
| Regulators & Government Agencies | Define laws, certification pathways, and operational constraints for autonomous systems across domains (ground, air, marine, space). They translate legislation into enforceable rules and approvals. | NHTSA, FAA, EASA, International Maritime Organization, NASA, ESA |
| Standards Organizations / Industry Consortia | Develop technical standards, safety frameworks, and autonomy classification systems that regulators and industry rely on (e.g., SAE levels, ISO safety standards). | SAE International, ISO, IEEE, RTCA, ASTM |
| Legal & Advisory Firms | Interpret liability, compliance, and regulatory frameworks; support litigation, risk assessment, and policy strategy for autonomy deployments. | Baker McKenzie, DLA Piper, Latham & Watkins |
| Certification & Testing Authorities | Provide independent validation, certification audits, and compliance verification against safety standards (ASIL, DAL, etc.). Critical for market entry. | TÜV SÜD, UL Solutions, DNV |
| Simulation & Digital Twin Software Providers | Provide tools for scenario-based validation, digital twins, and V&V workflows across autonomy stacks (SIL/HIL, scenario generation, formal testing). | NVIDIA (DRIVE Sim), MathWorks, Ansys, Siemens |
| Test Track & Physical Testing Infrastructure Providers | Operate controlled environments for real-world validation (proving grounds, UAV corridors, maritime test ranges). Bridge sim-to-real validation. | American Center for Mobility, Mcity, FAA UAS Test Sites |
Industries and Companies:
| Type | Description | Example Players (Companies) |
|---|---|---|
| Semiconductor Manufacturers (Logic & Compute) | Design and manufacture digital logic devices (MCUs, MPUs, SoCs, AI accelerators) that execute perception, planning, and control workloads in autonomous systems. | Intel, NVIDIA, Qualcomm, NXP Semiconductors |
| Analog & Mixed-Signal Semiconductor Providers | Provide sensing interfaces, power management ICs, ADC/DACs, and signal conditioning required to convert physical signals into digital data. | Texas Instruments, Analog Devices, Infineon Technologies |
| Power Semiconductor & Wide Bandgap Players | Develop Si, SiC, and GaN devices for high-efficiency power conversion in EVs, aircraft electrification, marine propulsion, and space systems. | Wolfspeed, onsemi, STMicroelectronics |
| Sensor Manufacturers (Perception Hardware) | Build core sensing modalities (camera, radar, LiDAR, IMU, GNSS, sonar, star trackers) that define system observability and autonomy limits. | Bosch, Continental AG, Velodyne LiDAR, Teledyne Technologies |
| RF & Communication Chip / Module Providers | Provide connectivity hardware (5G, V2X, satellite comms, radar front-ends) enabling communication and extended perception. | Skyworks Solutions, Qorvo, Broadcom |
| FPGA & Reconfigurable Compute Vendors | Supply programmable logic for deterministic, safety-critical and adaptable processing in aerospace, defense, and space systems. | AMD, Intel |
| EDA (Electronic Design Automation) Companies | Provide design, simulation, verification, and sign-off tools spanning chip, package, and PCB levels—critical for hardware validation and production. | Synopsys, Cadence Design Systems, Siemens |
| Foundries & Advanced Packaging Providers | Fabricate semiconductors and provide advanced packaging technologies for high-performance and reliable systems. | TSMC, Samsung Foundry, Intel Foundry Services |
| Vendor | Platform / Kit | Type | Key Components | Target Domain | Notes / Differentiation |
|---|---|---|---|---|---|
| NVIDIA | NVIDIA DRIVE (Orin / Thor) | Full autonomy compute platform | GPU SoC, Tensor cores, CUDA, DriveWorks SDK | Automotive autonomy (L2–L4) | End-to-end AV compute + software stack |
| NVIDIA | Jetson Orin Dev Kit | Embedded AI compute platform | CPU + GPU SoC, camera interfaces | Robotics, drones, edge AI | Widely used for prototyping |
| Qualcomm | Snapdragon Ride | Automotive compute platform | AI accelerator, vision DSP, sensor fusion | Automotive ADAS/AV | Strong power efficiency + integration |
| Intel | Mobileye EyeQ / AV platform | Vision-centric ADAS platform | Vision SoC, camera-based perception software | Automotive ADAS | Camera-first autonomy strategy |
| AMD | Versal Adaptive SoCs | FPGA/ACAP compute platform | FPGA fabric + AI engines | Automotive, aerospace | Deterministic + adaptive compute |
| Texas Instruments | TDA4VM / Jacinto | ADAS processor | Vision DSP, radar processing, safety MCUs | Automotive | Strong functional safety (ISO 26262 focus) |
| NXP Semiconductors | S32V / BlueBox | Automotive compute + networking | Vision SoC, radar processing, CAN/FlexRay | Automotive | Strong vehicle networking integration |
| Bosch | Radar / ADAS platforms | Sensor + ECU systems | Radar, camera, ECU modules | Automotive | Tier-1 integrated sensor + compute solutions |
| Continental AG | Continental ADAS Dev Platform | Sensor fusion system | Radar, LiDAR, camera modules | Automotive | Strong system-level integration |
| Velodyne LiDAR | LiDAR Dev Kits (e.g., Puck) | Sensor dev kits | 3D LiDAR + SDK | Autonomous, robotics | High-resolution 3D perception |
| Ouster | Ouster OS1 / Gemini | LiDAR platform | Digital LiDAR + API | Robotics, industrial | Software-defined LiDAR stack |
| Analog Devices | Radar Development Kits | RF sensing platform | RF front-end + DSP | Automotive, industrial | Strong RF + signal chain expertise |
| Infineon Technologies | AURIX + Radar Kits | Safety MCU + radar | Radar IC + safety MCU | Automotive | Leading safety MCU platform |
| STMicroelectronics | STM32 + Sensor Kits | Embedded sensing platform | MCU + IMU, GNSS, camera | Robotics, IoT | Low-cost prototyping ecosystem |
| Teledyne Technologies | Imaging Sensor Kits | Vision sensing | CMOS sensors, thermal imaging | Aerospace, defense | High-performance imaging |
| Sony | CMOS Image Sensors | Vision sensors | High dynamic range sensors | Automotive, consumer | Dominant in camera sensing |
| Hexagon | Autonomous Sensors | Software + sensors | LiDAR + mapping + analytics | Industrial autonomy | Strong digital twin ecosystem |
| dSPACE | HIL (Hardware-in-the-Loop) systems | Validation platform | Sensor models, ECU integration | Automotive, aerospace | Critical for V&V workflows |