This shows you the differences between two versions of the page.
| en:safeav:maps:detection [2025/10/24 12:40] – kosnark | en:safeav:maps:detection [2026/04/24 09:43] (current) – raivo.sell | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Object Detection, Sensor Fusion, Mapping, and Positioning ====== | ====== Object Detection, Sensor Fusion, Mapping, and Positioning ====== | ||
| - | {{: | ||
| ===== Object Detection ===== | ===== Object Detection ===== | ||
| Line 98: | Line 97: | ||
| The pipeline operates continuously in real time (typically 10–30 Hz) with deterministic latency to meet safety and control requirements. | The pipeline operates continuously in real time (typically 10–30 Hz) with deterministic latency to meet safety and control requirements. | ||
| + | ===== Sensor Fusion ===== | ||
| + | |||
| + | No single sensor technology can capture all aspects of a complex driving scene under all circumstances, | ||
| + | |||
| + | Each sensor modality has distinct advantages and weaknesses: | ||
| + | * **Cameras** provide high-resolution color and texture information, | ||
| + | * **LiDAR** delivers precise 3D geometry and range data, allowing accurate distance estimation and shape reconstruction, | ||
| + | * **Radar** measures object velocity and distance robustly, even in poor visibility, but has coarse angular resolution and may struggle with small or static objects. | ||
| + | * **GNSS** provides global position but suffers from signal blockage and reflection in urban canyons, in tunnels, and under tree canopies. | ||
| + | * **IMU** provides motion estimation but is prone to drift and accumulated error. | ||
| + | |||
| + | By fusing these complementary data sources, the perception system can achieve redundancy, increased accuracy, and fault tolerance, all of which are key factors for functional safety (ISO 26262). | ||
| + | |||
| + | Sensor fusion can focus on **complementarity**, where different sensors contribute unique, non-overlapping information, and on **redundancy**, where overlapping sensors confirm each other’s measurements, | ||
| + | |||
| + | Accurate fusion depends critically on spatial and temporal alignment among sensors. | ||
| + | |||
| + | * **Extrinsic calibration** determines the rigid-body transformations (translation and rotation) between sensors. It is typically estimated through target-based calibration (e.g., with checkerboards or reflective spheres) or through self-calibration using environmental features. | ||
| + | * **Intrinsic calibration** corrects sensor-specific distortions, | ||
| + | * **Temporal synchronization** ensures that all sensor measurements correspond to the same physical moment, using hardware triggers, shared clocks, or interpolation. | ||
| + | |||
| + | Calibration errors lead to spatial inconsistencies that can degrade detection accuracy or cause false positives. Therefore, calibration is treated as part of the functional safety chain and is regularly verified in maintenance and validation routines. | ||
| + | |||
| + | |||
| + | Fusion can occur at different stages in the perception pipeline, commonly divided into three levels: | ||
| + | |||
| + | * **Data-level** fusion combines raw signals from sensors before any interpretation, | ||
| + | * **Feature-level** fusion merges processed outputs such as detected edges, motion vectors, or depth maps, balancing detail with efficiency. | ||
| + | * **Decision-level** fusion integrates conclusions drawn independently by different sensors, producing a final decision that benefits from multiple perspectives. | ||
| + | |||
| + | The mathematical basis of sensor fusion lies in probabilistic state estimation and Bayesian inference. | ||
| + | Typical formulations represent the system state as a probability distribution updated by sensor measurements. | ||
| + | Common techniques include: | ||
| + | * **Kalman Filter (KF)** and its nonlinear extensions, the **Extended Kalman Filter (EKF)** and **Unscented Kalman Filter (UKF)**, which maintain a Gaussian estimate of state uncertainty and iteratively update it as new sensor data arrive. | ||
| + | * **Particle Filter (PF)**, which uses a set of weighted samples to approximate arbitrary non-Gaussian distributions. | ||
| + | * **Bayesian Networks** and **Factor Graphs**, which represent dependencies between sensors and system variables as nodes and edges, enabling large-scale optimization. | ||
| + | * **Deep Learning–based Fusion**, where neural networks implicitly learn statistical relationships between sensor modalities through backpropagation rather than explicit probabilistic modeling. | ||
| + | |||
| + | |||
| + | ==== Learning-Based Fusion Approaches ==== | ||
| + | Deep learning has significantly advanced sensor fusion. | ||
| + | Neural architectures learn optimal fusion weights and correlations automatically, | ||
| + | For example: | ||
| + | * **BEVFusion** fuses LiDAR and camera features into a top-down bird's-eye-view (BEV) representation for 3D detection. | ||
| + | * **TransFusion** uses transformer-based attention to align modalities dynamically. | ||
| + | * **DeepFusion** and **PointPainting** project LiDAR points into the image plane, enriching each point with semantic features taken from the camera image (per-pixel class scores in the case of PointPainting). | ||
| + | |||
| + | End-to-end fusion networks can jointly optimize detection, segmentation, | ||
| + | However, deep fusion models require large multimodal datasets for training and careful validation to ensure generalization and interpretability. | ||
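
The projection step behind PointPainting-style fusion mentioned above can be illustrated with a small sketch. It assumes an ideal pinhole camera without lens distortion, a known extrinsic rotation and translation from the LiDAR frame to the camera frame, and segmentation scores that are already available as a nested list. The function names `project_to_pixel` and `paint_points` and all numeric values are hypothetical and chosen only for illustration; a real system would use calibrated parameters and the output of a segmentation network.

```python
import math


def project_to_pixel(point_lidar, rotation, translation, fx, fy, cx, cy):
    """Project one LiDAR point to integer pixel coordinates (u, v).

    Returns None when the point lies behind the camera.
    """
    # Extrinsic calibration: p_cam = R * p_lidar + t
    x, y, z = (
        sum(r * p for r, p in zip(row, point_lidar)) + t
        for row, t in zip(rotation, translation)
    )
    if z <= 0.0:
        return None
    # Intrinsic calibration: ideal pinhole model, no distortion (assumption)
    u = math.floor(fx * x / z + cx)
    v = math.floor(fy * y / z + cy)
    return u, v


def paint_points(points_lidar, rotation, translation, intrinsics, seg_scores):
    """Append per-pixel class scores to every point that lands inside the image.

    seg_scores is indexed as seg_scores[row][column] -> list of class scores.
    """
    height, width = len(seg_scores), len(seg_scores[0])
    painted = []
    for point in points_lidar:
        pixel = project_to_pixel(point, rotation, translation, *intrinsics)
        if pixel is None:
            continue
        u, v = pixel
        if 0 <= u < width and 0 <= v < height:
            painted.append(tuple(point) + tuple(seg_scores[v][u]))
    return painted


# Toy setup: LiDAR and camera frames coincide (identity extrinsics).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# 3 x 4 pixel image with two classes per pixel.
scores = [[[1.0, 0.0] for _ in range(4)] for _ in range(3)]
scores[1][2] = [0.1, 0.9]

points = [
    (0.0, 0.0, 5.0),    # in front of the camera, projects to pixel (2, 1)
    (0.0, 0.0, -5.0),   # behind the camera, dropped
    (10.0, 0.0, 1.0),   # projects outside the image, dropped
]
painted = paint_points(points, IDENTITY, (0.0, 0.0, 0.0), (1.0, 1.0, 2.0, 1.0), scores)
print(painted)
```

Only the first point survives, and it carries the class scores of the pixel it falls on. The sketch also shows why the calibration and synchronization steps described earlier matter: an error in the rotation, translation, or intrinsics shifts the pixel each point is assigned to, and therefore the semantic label attached to it.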