This shows you the differences between two versions of the page.
| en:safeav:softsys:vvsoftsysmidd [2026/06/17 14:15] – created karlisberkolds | en:safeav:softsys:vvsoftsysmidd [2026/06/17 15:16] (current) – karlisberkolds | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== V&V of Software Systems and Middleware ====== | ====== V&V of Software Systems and Middleware ====== | ||
| + | |||
| + | ===== V&V objectives and evidence chain ===== | ||
| + | |||
| + | Verification asks whether the software artefact was built correctly against its specification. Validation asks whether the right behaviour has been specified and achieved in the intended operational context. In autonomous systems both questions must be answered at multiple levels. Unit tests may show that a planner function satisfies a local requirement, | ||
| + | |||
| + | A useful evidence chain begins with system hazards and operational assumptions. These are translated into software safety requirements, | ||
| + | |||
| + | ===== Requirements and architecture V&V ===== | ||
| + | |||
| + | Requirements verification starts before code exists. Requirements must be unambiguous, | ||
| + | |||
| + | Architectural verification uses reviews, interface analysis, failure-mode analysis, threat modelling, timing analysis and safety analysis to determine whether the structure can satisfy the requirements. It should examine partitioning between safety-critical and non-safety functions, freedom from interference, | ||
| + | |||
| + | ===== Implementation, | ||
| + | |||
| + | Implementation-level verification examines source code, generated code, models and configuration files. Common methods include peer review, static analysis, coding-standard compliance, unit testing, structural coverage, model checking where feasible, and toolchain qualification when tools can introduce or fail to detect errors. Unit verification is necessary but not sufficient because autonomous behaviour arises from interactions among many components. | ||
| + | |||
| + | Integration V&V tests whether independently verified components work together correctly. For middleware, this includes message compatibility, | ||
| + | |||
| + | ===== Simulation, HIL and scenario validation ===== | ||
| + | |||
| + | Autonomous-system validation progresses in stages from models to real systems. Model-in-the-loop tests verify algorithms against mathematical models. Software-in-the-loop tests execute production or near-production software in simulated environments. Processor-in-the-loop tests run the compiled software on the target processor or an instruction-set simulator, exposing compiler and processor effects. Hardware-in-the-loop tests run real controllers or compute platforms against simulated sensors, actuators and plant dynamics. Field trials and operational pilots then validate behaviour in the physical environment. | ||
| + | |||
| + | This progression reduces risk but does not remove uncertainty. Simulation fidelity is limited by environment models, sensor models, traffic or mission models and assumptions about rare events. HIL may represent timing accurately but not the full physical world. Field testing is realistic but cannot cover all combinations of weather, traffic, faults, human behaviour and cyber conditions. Scenario validation should therefore be risk-based and traceable to hazards, operational design domain boundaries and known limitations. | ||
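The traceability requirement above can be checked mechanically. The following is a minimal sketch, assuming hazards and scenarios are kept as simple records; the `Scenario` type, the hazard identifiers and the scenario names are hypothetical and only illustrate the idea of finding hazards with no scenario and scenarios with no hazard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One validation scenario and the hazard IDs it is meant to exercise."""
    name: str
    hazard_ids: frozenset


def untraced_hazards(hazards, scenarios):
    """Return hazard IDs that no scenario exercises, sorted."""
    covered = set()
    for scenario in scenarios:
        covered |= scenario.hazard_ids
    return sorted(set(hazards) - covered)


def orphan_scenarios(hazards, scenarios):
    """Return names of scenarios that trace to no known hazard."""
    known = set(hazards)
    return [s.name for s in scenarios if not (s.hazard_ids & known)]


# Hypothetical example data (not taken from any real hazard analysis).
hazards = ["H-01", "H-02", "H-03"]
scenarios = [
    Scenario("cut-in at low speed", frozenset({"H-01"})),
    Scenario("sensor dropout in rain", frozenset({"H-02"})),
    Scenario("legacy regression run", frozenset()),
]

print(untraced_hazards(hazards, scenarios))   # ['H-03']
print(orphan_scenarios(hazards, scenarios))   # ['legacy regression run']
```

A check of this kind only shows that a link exists; whether the scenario is a credible test of the hazard still needs review.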
| + | |||
| + | ===== Configuration, | ||
| + | |||
| + | A software release is not only code. It is a configuration baseline containing requirements, | ||
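One way to make a configuration baseline reconstructable is to record a content hash for every item in it. The sketch below assumes the baseline items are available as named byte strings; the item names and the helper functions `build_manifest`, `manifest_id` and `changed_items` are hypothetical and do not describe any particular release tool.

```python
import hashlib
import json


def build_manifest(items):
    """Map each baseline item name to the SHA-256 digest of its content."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(items.items())}


def manifest_id(manifest):
    """Derive one identifier for the whole baseline from a canonical encoding."""
    canonical = json.dumps(manifest, sort_keys=True,
                           separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def changed_items(old, new):
    """List item names added, removed or modified between two manifests."""
    return sorted(n for n in set(old) | set(new) if old.get(n) != new.get(n))


# Hypothetical baseline contents, held in memory for illustration.
release_a = {
    "planner.bin": b"binary-v1",
    "qos_profile.xml": b"<qos depth='5'/>",
    "requirements.csv": b"REQ-1,planner shall ...",
}
release_b = dict(release_a, **{"qos_profile.xml": b"<qos depth='10'/>"})

manifest_a = build_manifest(release_a)
manifest_b = build_manifest(release_b)

print(changed_items(manifest_a, manifest_b))              # ['qos_profile.xml']
print(manifest_id(manifest_a) == manifest_id(manifest_b))  # False
```

In this sketch a change to a middleware setting alters the baseline identifier just as a code change does, which is the point of treating the release as more than code.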
| + | |||
| + | Operational V&V extends assurance after deployment. Monitoring should detect deadline misses, degraded sensors, software restarts, communication failures, unusual scenario distributions, | ||
| + | |||
| + | ===== Standards, assurance cases, limitations and risks ===== | ||
| + | |||
| + | Standards such as IEC 61508, ISO 26262, DO-178C, ISO/ | ||
| + | |||
| + | Software V&V has inherent limitations. Exhaustive testing is infeasible for complex distributed autonomy. Timing measurements are workload-dependent. Simulation is limited by model fidelity. Formal methods are limited by assumptions and scalability. Field testing is expensive and cannot cover all rare events. Standards reduce process risk but do not guarantee that requirements are complete or that operational assumptions remain valid. Residual risk must be managed through architectural containment, | ||
| + | |||
| + | ===== Metrics, reviews and acceptance criteria ===== | ||
| + | |||
| + | Software V&V needs measurable acceptance criteria, but metrics must be interpreted in relation to the safety argument. Defect counts, code coverage, test pass rates and static-analysis warnings are useful management indicators; they are not direct measures of safety. A release with high statement coverage may still lack tests for hazardous scenarios, while a release with many low-severity warnings may be safer than one with fewer but unresolved timing or configuration risks. | ||
| + | |||
| + | For middleware and runtime platforms, useful metrics include end-to-end latency distributions, | ||
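Latency is better reported as a distribution than as an average, because a mean can hide the tail that causes deadline misses. The following is a minimal sketch, assuming latency samples in milliseconds and a single deadline; the function names, sample values and the 20 ms deadline are illustrative assumptions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty sample list, with 0 < p <= 100."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


def summarise_latency(samples_ms, deadline_ms):
    """Summarise a latency distribution and its deadline-miss rate."""
    misses = sum(1 for s in samples_ms if s > deadline_ms)
    return {
        "p50_ms": percentile(samples_ms, 50),
        "p99_ms": percentile(samples_ms, 99),
        "max_ms": max(samples_ms),
        "deadline_miss_rate": misses / len(samples_ms),
    }


# Hypothetical end-to-end latency samples in milliseconds.
samples = [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0, 21.0]
summary = summarise_latency(samples, deadline_ms=20.0)

print(summary["p50_ms"])              # 6.5
print(summary["p99_ms"])              # 21.0
print(summary["deadline_miss_rate"])  # 0.1
```

Here the median looks comfortable while the tail exceeds the deadline, so a criterion stated only on the median would pass a release that misses one deadline in ten.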
| + | |||
| + | Acceptance criteria should be risk-based. A non-safety display defect may be accepted with a workaround, while an intermittent stale-data condition in a control path should block release until understood and mitigated. The release decision should record residual risks, known limitations, | ||
| + | |||
| + | ===== Practical V&V planning checklist ===== | ||
| + | |||
| + | A practical V&V plan begins by listing software functions that can contribute to hazards. For each function, the plan identifies the responsible architectural element, relevant requirements, | ||
| + | |||
| + | The plan should define integration milestones. Early milestones can focus on models, unit tests and interface contracts. Middle milestones should add middleware stress tests, timing measurement, | ||
| + | |||
| + | Finally, the plan should define how evidence changes after deployment. Connected autonomous systems can receive new code, calibration, | ||
| + | |||
| + | <table Ref.Tab.5.5> | ||
| + | < | ||
| + | |||
| + | ^ Planning question ^ Evidence to request ^ | ||
| + | | What hazards can this software influence? | ||
| + | | What assumptions does the middleware make? | Latency budgets, freshness limits, QoS settings, clock synchronisation requirements. | | ||
| + | | How is the release reconstructed? | ||
| + | | How are AI components controlled? | Dataset and model versioning, scenario results, robustness tests, runtime monitor evidence. | | ||
| + | | What happens after deployment? | Monitoring plan, incident review process, update impact analysis and rollback procedure. | | ||
| + | |||
| + | |||
| + | </ | ||
| + | |||
| + | ===== Method limitations and residual-risk control ===== | ||
| + | |||
| + | No single V&V method is sufficient for software systems and middleware. Reviews are effective for finding requirement ambiguity and architectural gaps, but they depend on reviewer expertise. Static analysis can find many implementation defects, but it does not validate operational behaviour. Unit tests provide local confidence, but they do not expose distributed timing faults. Simulation explores many scenarios safely, but its results are only as credible as its models. Field trials reveal real-world behaviour, but they cannot cover all rare combinations of faults, weather, users and environments. | ||
| + | |||
| + | For this reason, residual risk must be managed by combining evidence and by designing systems that remain safe when evidence is incomplete. Runtime monitors can detect stale data, confidence loss, missed deadlines and inconsistent sensor streams. Degraded modes can reduce speed, request human takeover, hold position, return to base or enter a safe stop. Redundant channels can cross-check critical functions. Secure update and rollback mechanisms can correct faults without introducing uncontrolled changes. | ||
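The monitor-and-degrade pattern described above can be sketched as follows. This is an illustration under stated assumptions, not a reference design: the `Mode` values, the `ChannelStatus` fields, the channel names and the `max_missed` threshold are all hypothetical, and a deployed monitor would take its timestamps from a monotonic clock.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    NOMINAL = "nominal"
    REDUCED_SPEED = "reduced_speed"
    SAFE_STOP = "safe_stop"


@dataclass
class ChannelStatus:
    name: str
    last_update_s: float   # timestamp of the most recent message
    max_age_s: float       # freshness limit for this channel
    safety_critical: bool


def stale_channels(channels, now_s):
    """Return channels whose latest message is older than their freshness limit."""
    return [c for c in channels if now_s - c.last_update_s > c.max_age_s]


def select_mode(channels, now_s, missed_deadlines, max_missed=3):
    """Choose an operating mode from data freshness and deadline misses."""
    stale = stale_channels(channels, now_s)
    if any(c.safety_critical for c in stale) or missed_deadlines > max_missed:
        return Mode.SAFE_STOP
    if stale or missed_deadlines > 0:
        return Mode.REDUCED_SPEED
    return Mode.NOMINAL


# Hypothetical channels: a critical sensor stream and a non-critical map feed.
channels = [
    ChannelStatus("lidar", last_update_s=9.95, max_age_s=0.1, safety_critical=True),
    ChannelStatus("map", last_update_s=5.0, max_age_s=2.0, safety_critical=False),
]

print(select_mode(channels, now_s=10.0, missed_deadlines=0))  # Mode.REDUCED_SPEED
print(select_mode(channels, now_s=10.2, missed_deadlines=0))  # Mode.SAFE_STOP
```

Keeping the mode decision in one small function makes it easier to review and to test exhaustively than logic spread across several components.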
| + | |||
| + | The safety case should make these limits explicit. It should state what has been verified, what has been validated, what assumptions remain, which assumptions are monitored during operation and which authority is responsible for responding when evidence changes. This closes the loop between pre-release V&V and operational safety management. | ||
| + | |||
| + | |||