Scanning the Issue

Published in: IEEE Transactions on Intelligent Transportation Systems ( Volume: 26, Issue: 3, March 2025)

Page(s): 2814 - 2832

Date of Publication: 03 March 2025

DOI: 10.1109/TITS.2025.3540181

A Multifaceted Analysis of Intelligent Vehicle Route Optimization

Pooja and S. K. Sood

This article presents a comprehensive survey of the current research progress in the ICT-assisted vehicle route optimization domain. The scientometric analysis is performed on the basis of five major categories, viz., computational intelligence techniques, artificial intelligence, location-based services, pervasive computing, and communication technologies, offering valuable insights and visual maps of the vehicle route optimization literature that help researchers and software practitioners comprehend the research landscape, collaborating community, and intellectual base. A perspective of research themes is given, early-stage key technologies are identified, and possible research directions are discussed. Furthermore, appropriate algorithms for various vehicle routing problems are given, and insights from significant articles are provided for researchers.

A Guide to Image- and Video-Based Small Object Detection Using Deep Learning: Case Study of Maritime Surveillance

A. M. Rekavandi, L. Xu, F. Boussaid, A.-K. Seghouane, S. Hoefs, and M. Bennamoun

As small objects occupy only a small area in the input image, the information extracted from such a small area is not always rich enough to support decision-making. As a result, generic object detectors often fail to accurately localize and identify such objects (e.g., pedestrians, small vehicles, and obstacles). In this article, the authors provide a comprehensive review of over 160 research papers published between 2017 and 2022 in order to survey this growing subject. The article summarizes the existing literature and provides a taxonomy that illustrates the broad picture of current research. The authors further explore methods to boost the performance of small object detection (SOD) in maritime settings, where enhanced performance is crucial for ensuring safety and managing traffic. In addition, the popular SOD datasets for generic and maritime applications are discussed, and well-known evaluation metrics for the state-of-the-art methods on some of these datasets are provided.

Uncertainty Quantification for Safe and Reliable Autonomous Vehicles: A Review of Methods and Applications

K. Wang, C. Shen, X. Li, and J. Lu

In the past decade, deep learning has been widely applied across various fields. However, its applicability in open-world scenarios is often limited by the lack of uncertainty quantification for both data and models. In recent years, a multitude of uncertainty quantification (UQ) approaches for neural networks have emerged and found applications in safety-critical domains such as autonomous vehicles and medical analysis. This article aims to review the latest advancements in UQ methods and investigate their application specifically in the fields of computer vision and autonomous vehicles. Initially, the authors identify several key qualifications, namely, practicability, robustness, accuracy, scalability, and efficiency (referred to as PRASE), and employ them as evaluation criteria throughout this study. By considering these criteria as uniform measurements, they evaluate and compare the performance of different types of UQ methods, including Bayesian methods, ensemble methods, and single deterministic methods. Furthermore, they discuss the application of these methods in diverse tasks within the autonomous vehicle domain, such as semantic segmentation, object detection, depth estimation, and end-to-end control. Through comprehensive analysis and comparison, they identify a range of challenges and propose future research directions in this field. The findings shed light on the importance of addressing UQ in deep learning models and provide insights into enhancing the reliability and performance of autonomous vehicles in real-world scenarios.
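
As a rough illustration of the ensemble family mentioned above (and not the authors' evaluation protocol), the following sketch splits the uncertainty of a small classifier ensemble into total, aleatoric, and epistemic parts using the common entropy-based decomposition. The function names and the example probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ensemble_uncertainty(member_probs):
    """Decompose uncertainty for ONE input.

    member_probs: list of class-probability vectors, one per ensemble member
    (hypothetical softmax outputs; any model producing probabilities would do).
    Returns (mean_probs, total, aleatoric, epistemic).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Ensemble prediction: average of the member distributions.
    mean_probs = [sum(m[c] for m in member_probs) / n_members
                  for c in range(n_classes)]
    total = entropy(mean_probs)                      # predictive entropy
    aleatoric = sum(entropy(m) for m in member_probs) / n_members
    epistemic = total - aleatoric                    # mutual information
    return mean_probs, total, aleatoric, epistemic


# Members that agree carry little epistemic uncertainty; members that
# disagree carry more, even if each member is individually confident.
agree = ensemble_uncertainty([[0.9, 0.1], [0.9, 0.1]])
disagree = ensemble_uncertainty([[0.9, 0.1], [0.1, 0.9]])
```

The epistemic term is the part that, in principle, shrinks with more training data, which is why it is often the quantity of interest for out-of-distribution inputs.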

Recent Estimation Techniques of Vehicle-Road-Pedestrian States for Traffic Safety: Comprehensive Review and Future Perspectives

C. Tian, C. Huang, Y. Wang, E. Chung, A.-T. Nguyen, P. K. Wong, W. Ni, A. Jamalipour, K. Li, and H. Huang

Accurate and real-time acquisition of vehicular system dynamic states, road surface conditions, and motion states of surrounding participants is crucial for the safety, passenger comfort, and operational efficiency of autonomous vehicles (AVs) and connected automated vehicles (CAVs). In recent years, a significant amount of research has contributed to the field of state estimation for vehicles, roads, and pedestrians. From the systemwide perspective of intelligent transportation systems to a focused view on “vehicle-road-pedestrian,” this survey aims to provide a comprehensive review and summary of recent state estimation techniques for vehicle motion, road surface, and pedestrian motion. A thorough analysis of the reviewed literature, relevant datasets, evaluation metrics, and experimental platforms in this field is also conducted. Finally, existing challenges and future research directions about methods and performance evaluation are further discussed.

Advancing Vulnerable Road Users Safety: Interdisciplinary Review on V2X Communication and Trajectory Prediction

B. Abdi, S. Mirzaei, M. Adl, S. Hidajat, and A. Emadi

V2X communication systems and deep learning-based trajectory prediction (TP) models synergize to enhance road safety for vulnerable road users (VRUs). V2X facilitates real-time data exchange between vehicles, infrastructure, and road users, while TP predicts VRU movements, identifying hazards early to prevent collisions. This integration forms a cooperative safety network: V2X broadcasts collision warnings, while TP reduces bandwidth demands by processing data locally and transmitting only critical insights. This improves communication efficiency and tackles challenges like delays and scalability in dense traffic. In this survey article, the authors explored V2X standards capabilities and TP advancements, showcasing their combined potential to improve VRU safety. They discussed their complementary roles in leveraging real-time communication and predictive analytics to build safer, smarter roads.

An Efficient Deep Spatio-Temporal Context Aware Decision Network (DST-CAN) for Predictive Manoeuvre Planning on Highways

J. Chowdhury, S. Sundaram, N. Rao, and N. Sundararajan

The safety and efficiency of autonomous vehicle (AV) manoeuvre planning heavily depend on the future trajectories of surrounding vehicles. If an AV can predict its surrounding vehicles’ future trajectories, it can make safe and efficient manoeuvre decisions. In this article, the authors present a deep spatio-temporal context-aware decision network (DST-CAN) for predictive manoeuvre decisions for AVs on highways. DST-CAN has two main components, namely, a spatio-temporal context-aware map generator and a predictive manoeuvre decision engine. DST-CAN employs a memory neuron network to predict the future trajectories of its surrounding vehicles. Using look-ahead prediction and past actual trajectories, a spatio-temporal context-aware probability occupancy map is generated. These context-aware maps are fed to a decision engine, which generates a safe and efficient manoeuvre decision. Here, a CNN extracts features, and two fully connected networks generate longitudinal and lateral manoeuvre decisions. Performance evaluation of DST-CAN has been carried out using two publicly available highway datasets, NGSIM US-101 and I-80. A traffic rule is defined to generate ground truths for these datasets in addition to human decisions. Two DST-CAN models are trained using imitation learning, one with human driving decisions from actual traffic data and one with rule-based ground truth decisions. The performances of the DST-CAN models are compared with the state-of-the-art convolutional social-LSTM (CS-LSTM) models for manoeuvre prediction. The results clearly indicate that the context-aware maps help DST-CAN predict the decision more accurately than CS-LSTM. Furthermore, an ablation study has been carried out to understand the effect of prediction horizons on performance, along with a robustness study to understand near-collision scenarios over actual traffic observations. The context-aware map with a 3-s prediction horizon is robust against near collisions.

Calibration-Free Driver Drowsiness Classification With Prototype-Based Multi-Domain Mixup

D.-Y. Kim, D.-K. Han, J.-H. Jeong, and S.-W. Lee

A calibration-free EEG-based framework is proposed to classify driver drowsiness, addressing subject variability without requiring calibration. The framework introduces prototype-based multi-domain mixup (PDMix) to generate unseen domains, enhancing the diversity of training data, and applies auxiliary batch normalization (ABN) to distinguish features from different domains and prevent inaccurate statistical estimation. In leave-one-subject-out cross-validation, the proposed framework achieved outstanding performance on both datasets, with F1-scores of 62.69% and 70.33% and areas under the receiver operating characteristic curve (AUROC) of 71.73% and 73.80%, respectively. The experimental results demonstrate the potential for practical applications of brain-computer interfaces without calibration.

Design of a Cost Effective Spatial Image Registration System for Augmented Reality in Vehicular Applications

M. Corno, L. Franceschetti, and S. M. Savaresi

Augmented reality (AR) has the potential to enhance the driving experience by improving the driver’s situational awareness. The technique used to create holograms that appear anchored in specific positions in the real world is known as image registration. For effective AR applications, precise head tracking is essential. This article addresses low-cost ground vehicles and proposes a solution that eliminates the need for aerospace-grade inertial measurement units, making it easy to integrate into standard cars. The proposed method, tested on a racing circuit, relies on passive markers and utilizes stereoscopic detection to accurately identify the road plane, anchoring the AR features effectively.

Adaptive Haptic Assistance Control Considering Individual Driver’s Arm Characteristics

H. Zhang, Y. Li, W. Zhao, W. Quan, and C. Wang

An adaptive haptic assistance control scheme is proposed to improve human-vehicle cooperation and enhance driver confidence in advanced driver assistance systems (ADASs). An expert driver model using a multi-layer feed-forward neural network (MLFN) is used to generate reference steering angles for the controller. The nonsingular fast terminal sliding mode (NFTSM) is employed to compute the assistance torque, ensuring fast convergence and robustness of the system. The individual driver’s arm characteristics are investigated and identified online. By incorporating the real-time individual driver’s arm characteristics, the controller offers personalized torque assistance, helping typical drivers achieve expert-level trajectory-tracking performance while reducing their steering workload.

Optimized Long Short-Term Memory Network for LiDAR-Based Vehicle Trajectory Prediction Through Bayesian Optimization

S. Zhou, I. Lashkov, H. Xu, G. Zhang, and Y. Yang

This study proposes a systematic approach for LSTM-based vehicle trajectory prediction using light detection and ranging (LiDAR) data, addressing the limitations of existing methods that rely heavily on user expertise or involve subjective hyperparameter selection. Bayesian optimization is employed to automatically determine the optimal hyperparameters for both the training process and LSTM network architectures. The proposed deep learning-based optimization framework is evaluated using a custom vehicle trajectory dataset extracted from roadside LiDAR data, as well as the V2X-Seq-TFD dataset. The optimized LSTM network, obtained through Bayesian optimization, is compared against two benchmark models: a handcrafted LSTM network and a Kalman filter with a 2-D constant velocity motion model. The results demonstrate that the proposed framework consistently outperforms the benchmark models, delivering more accurate and reliable vehicle trajectory predictions.
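
For readers unfamiliar with the Kalman filter benchmark named above, a minimal sketch of a 2-D constant-velocity filter follows. It is not the authors' implementation: it assumes position-only measurements and independent noise on the x and y axes, so the 2-D filter reduces to two 1-D filters, and all class names and noise values are hypothetical.

```python
class AxisCVKalman:
    """1-D constant-velocity Kalman filter with state [position, velocity]."""

    def __init__(self, pos, vel=0.0, pos_var=1.0, vel_var=10.0,
                 accel_var=1.0, meas_var=0.25):
        self.p, self.v = pos, vel
        # Symmetric 2x2 covariance stored as [[a, b], [b, c]].
        self.a, self.b, self.c = pos_var, 0.0, vel_var
        self.accel_var, self.meas_var = accel_var, meas_var

    def predict(self, dt):
        # State transition F = [[1, dt], [0, 1]]; covariance P = F P F^T + Q,
        # with Q from a white-noise acceleration model.
        self.p += self.v * dt
        a, b, c, q = self.a, self.b, self.c, self.accel_var
        self.a = a + 2.0 * dt * b + dt * dt * c + q * dt ** 4 / 4.0
        self.b = b + dt * c + q * dt ** 3 / 2.0
        self.c = c + q * dt ** 2

    def update(self, z):
        # Measurement model H = [1, 0]: only position is observed.
        s = self.a + self.meas_var           # innovation variance
        k_p, k_v = self.a / s, self.b / s    # Kalman gain
        residual = z - self.p
        self.p += k_p * residual
        self.v += k_v * residual
        a, b, c = self.a, self.b, self.c
        self.a = (1.0 - k_p) * a
        self.b = (1.0 - k_p) * b
        self.c = c - k_v * b


class CVKalman2D:
    """Two independent axis filters forming a 2-D constant-velocity tracker."""

    def __init__(self, x, y, **noise):
        self.fx = AxisCVKalman(x, **noise)
        self.fy = AxisCVKalman(y, **noise)

    def step(self, zx, zy, dt):
        """Predict over dt, then correct with the measured position (zx, zy)."""
        for axis_filter, z in ((self.fx, zx), (self.fy, zy)):
            axis_filter.predict(dt)
            axis_filter.update(z)
        return self.fx.p, self.fy.p, self.fx.v, self.fy.v

    def forecast(self, horizon):
        """Extrapolate the current estimate `horizon` seconds ahead."""
        return (self.fx.p + self.fx.v * horizon,
                self.fy.p + self.fy.v * horizon)
```

A trajectory prediction is then simply `forecast(h)` after the last `step`, which is what makes this model a convenient baseline: it has no learned parameters and extrapolates in a straight line.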

A Data-Driven Dynamics Simulation Model for Railway Vehicles Based on Lightweight 3DCNN With Physics-Informed Constraints

Z. Zheng, C. Yi, and J. Lin

The dynamics simulation of complex railway vehicles requires a dedicated vehicle model, such as a multi-body dynamics model. However, the multi-body model is time-consuming in long-distance simulation due to its computational complexity. This issue can be alleviated by using a data-driven vehicle dynamics model, owing to its effective generalization and computational speed. First, the physical model of the vehicle system is constructed to obtain the coupling relationship between the components. Second, this coupling relationship is embedded into the loss function of the deep neural network as physics-informed constraints. Furthermore, the network parameters satisfying certain physical laws are obtained by minimizing the loss function. Finally, the proposed lightweight 3-D convolutional neural network is used to predict the vibration state of the vehicle system. The dynamic responses resulting from the data-driven simulation model and the multi-body simulation model are investigated and compared. The simulation results show that the data-driven dynamics simulation model can accurately predict the vibration state of the vehicle system. The data-driven simulation model has a smaller size and faster operation speed, and can be applied to long-distance prediction research of vehicle systems.

Curricular Subgoals for Inverse Reinforcement Learning

S. Liu, Y. Qing, S. Xu, H. Wu, J. Zhang, J. Cong, T. Chen, Y.-F. Liu, and M. Song

Inverse reinforcement learning (IRL) aims to reconstruct the reward function from expert demonstrations to facilitate policy learning, and has demonstrated remarkable success in imitation learning. To promote expert-like behavior, existing IRL methods mainly focus on learning global reward functions to minimize the trajectory difference between the imitator and the expert. However, these global designs are still limited by redundant noise and error propagation problems, leading to unsuitable reward assignment and thus degrading the agent's capability in complex multi-stage tasks. In this article, the authors propose a novel curricular subgoal-based IRL (CSIRL) framework that explicitly disentangles one task into several local subgoals to guide agent imitation. Specifically, CSIRL first introduces the decision uncertainty of the trained agent over expert trajectories to dynamically select specific states as subgoals, which directly determines the exploration boundary of different task stages. To further acquire local reward functions for each stage, they customize a meta-imitation objective based on these curricular subgoals to train an intrinsic reward generator. Experiments on the D4RL and autonomous driving benchmarks demonstrate that the proposed methods yield results superior to the state-of-the-art counterparts, as well as better interpretability. The code is publicly available at https://github.com/Plankson/CSIRL.

Binocular-Separated Modeling for Efficient Binocular Stereo Matching

Y. Peng, J. Xu, G. Cao, and R. Zeng

To improve the accuracy and efficiency of binocular stereo matching, a lightweight binocular-separated feature extraction module is proposed that includes a view-shared multi-dilation fusion module and a view-specific feature extractor. A lightweight binocular-separated model is established for efficient binocular stereo matching. Both shared and unique features of the left and right viewing images are extracted to guarantee image-matching accuracy compared to deep neural networks. A multi-scale correlation modeling module is built to calculate the correlation between left and right features, dynamically constraining the construction of the cost volume to improve matching accuracy. The experiments show that the proposed method outperforms the deep model-based baseline method while using fewer parameters, and achieves superior matching performance in weak texture and edge regions.

Time-Aware and Direction-Constrained Collective Spatial Keyword Query

Z. Feng, G. Li, J. Li, C. Jin, and X. Du

This article pioneers the study of the time-aware and direction-constrained collective spatial keyword query (TDCoSKQ). To facilitate direction-related operations, space objects are organized using the polar coordinate system. An efficient space partition method is initially designed, upon which a new hybrid index structure, KRPQT, is developed. Based on KRPQT, several pruning strategies are proposed to prune irrelevant regions and objects from the perspective of keyword, time, and direction, and the basic algorithm KRPQB is proposed. Furthermore, the possible regions of result objects are analyzed and reduced, significantly decreasing the number of candidate regions and candidate results. Building on this, three optimization algorithms KRPSW, KRPSW+LFO, and KRPSW+LFRP are proposed. The proposed algorithms can also be extended to handle TDCoSKQ queries under other distance functions and TDCoSKQ queries with weighted objects.
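
To make the role of the polar coordinate system more concrete, here is a minimal sketch of keyword, time, and direction filtering around a query point. It is a brute-force candidate filter only, not the KRPQT index or any of the KRPQB/KRPSW algorithms, and it does not perform the collective (set-cover) step of the query. The object layout (`pos`, `keywords`, `hours`) and all names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def to_polar(origin, point):
    """Return (distance, angle in [0, 2*pi)) of `point` relative to `origin`."""
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    return math.hypot(dx, dy), math.atan2(dy, dx) % TWO_PI


def in_sector(theta, start, end):
    """True if `theta` lies in the counter-clockwise sector from start to end.

    Working modulo 2*pi handles sectors that wrap past 0 (e.g. 340 to 20 deg).
    A sector with start == end is treated as a single ray, not a full circle.
    """
    return (theta - start) % TWO_PI <= (end - start) % TWO_PI


def candidates(origin, keywords, when, sector, objects):
    """Objects that match a query keyword, are open at `when`, and lie in
    the direction sector (start, end); sorted by distance from `origin`."""
    wanted = set(keywords)
    hits = []
    for obj in objects:
        dist, theta = to_polar(origin, obj["pos"])
        opens, closes = obj["hours"]
        if (wanted & obj["keywords"]
                and opens <= when <= closes
                and in_sector(theta, *sector)):
            hits.append((dist, obj["name"]))
    return sorted(hits)
```

Storing each object's angle once, as a polar index would, turns the direction test into a simple interval check, which is presumably what makes direction-based pruning of whole regions practical.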

OST-HGCN: Optimized Spatial-Temporal Hypergraph Convolution Network for Trajectory Prediction

X. Lin, Y. Zhang, S. Wang, Y. Hu, and B. Yin

This article introduces OST-HGCN, an optimized hypergraph convolutional network for pedestrian trajectory prediction, essential for applications like autonomous driving and traffic management. OST-HGCN models multi-agent interactions using spatio-temporal hypergraphs, enabling fine-grained motion analysis and high-order interactions. Integrated with a CVAE-based framework, it predicts plausible trajectories more effectively. Experiments on NBA, NFL, SDD, and ETH-UCY datasets confirm its superior performance.

An Empirical Study of Ground Segmentation for 3-D Object Detection

H. Yang, D. Liang, Z. Liu, J. Li, Z. Zou, X. Ye, and X. Bai

The ratio of foreground to background points directly impacts the accuracy and speed of lidar-based 3-D object detection methods. However, existing methods generally ignore the impact of ground points. Although some traditional ground segmentation algorithms are available to remove ground point clouds, they usually suffer from over-segmentation, which leads to sub-optimal or even worse performance on the downstream 3-D detection task. The authors conduct an in-depth analysis and attribute this phenomenon to the fact that some crucial foreground points attached to the ground (e.g., the wheels of cars or the feet of pedestrians) are directly removed due to over-segmentation. To this end, they propose a new attached point restoring (APR) module to recover these discarded foreground points. The experimental results demonstrate the effectiveness and generalization of APR by integrating it into various ground segmentation algorithms to boost the performance or the running time of 3-D detection on the KITTI and Waymo datasets. Finally, they hope this article can serve as a new guide to inspire future research in this field. The code is available at https://github.com/yhc2021/GPR.

Co-Evolving Traffic State Parameters Prediction Based on Mechanism-Data Blending Driven Deep Learning

H. Dong, H. Zhang, F. Ding, and H. Tan

A mechanism-data blending driven method for co-evolving traffic state parameter prediction, multi-parameters hybrid tensor deep learning networks (MHT-Nets), is proposed. It embeds knowledge of the synergistic mechanisms between traffic parameters and simultaneously learns the spatial dependency of the road network and the synergistic influence relationship of the parameters. The experimental results demonstrate the efficacy of the proposed method and provide an effective tool for traffic state prediction with missing values.

Optimized Feature Points and Keyframe Methods for VSLAM in High-Dynamic Indoor Environments

Z. Hu, W. Qi, K. Ding, H. Qi, Y. Zhao, X. Zhang, and M. Wang

This article addresses challenges in traditional VSLAM algorithms caused by dynamic object movements, occlusions, and appearance changes in non-static environments. A novel method is proposed that combines YOLOv7-tiny and LK optical flow to detect and remove dynamic feature points. An adaptive threshold keyframe selection technique improves keyframe quality, while a dynamic keyframe sequence based on angular differences enhances loop closure efficiency. In addition, a ParC_NetVLAD algorithm is developed for robust image matching using ConvNeXt-Tiny, ParC-Net, and CBAM. Experiments demonstrate significant performance improvements, reducing ATE by 96.4% and RPE by 82.8% in dynamic environments and increasing loop closure accuracy by 2.6% on Pittsburgh30k.

Stochastic Calibration of Automated Vehicle Car-Following Control: An Approximate Bayesian Computation Approach

J. Jiang, Y. Zhou, G. Jafarsalehi, X. Wang, S. Ahn, and J. D. Lee

This article presents a stochastic calibration method based on approximate Bayesian computation (ABC). This method is applied to calibrate two car-following control models: linear control and model predictive control (MPC). The method is likelihood-function-free, where the likelihood function is replaced by simulation to approximate the posterior distribution of model parameters. This structure affords the flexibility to calibrate posterior joint distributions of complex models, even those without analytical closed forms such as MPC. Two experiments were conducted to evaluate how well the proposed method reproduces: i) marginal and joint distributions of model parameters, using synthetic data; and ii) vehicle trajectories (acceleration, speed, and position), using field data involving two commercial adaptive cruise control (ACC) systems. The results showed that the ABC method can reproduce marginal and joint distributions reasonably well for the linear controller as well as the non-analytical MPC-based controller, which was previously infeasible. The method can also robustly characterize the commercial ACC behavior at the trajectory level, which suggests that the simple linear controller better describes their behavior.
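
The likelihood-free idea can be sketched with the simplest ABC variant, rejection sampling. This is an illustration under stated assumptions, not the authors' method: the linear controller form, the uniform priors, the RMSE distance on follower speed, the "keep the closest draws" acceptance rule, and every name and constant below are hypothetical.

```python
import random


def simulate_follower(k_gap, k_speed, lead_speeds, dt=0.1,
                      headway=1.5, standstill=2.0):
    """Follower speed trace under an assumed linear car-following controller:
    accel = k_gap * (gap - desired_gap) + k_speed * (lead_speed - speed)."""
    v = lead_speeds[0]
    gap = standstill + headway * v          # start at the equilibrium gap
    speeds = []
    for v_lead in lead_speeds:
        desired_gap = standstill + headway * v
        accel = k_gap * (gap - desired_gap) + k_speed * (v_lead - v)
        gap += (v_lead - v) * dt            # Euler integration
        v = max(0.0, v + accel * dt)
        speeds.append(v)
    return speeds


def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def abc_rejection(observed, lead_speeds, n_draws=2000, n_keep=100, seed=0):
    """Draw parameters from the prior, simulate, and keep the n_keep draws
    whose simulated trace is closest to the observed one. The kept draws
    approximate the posterior; no likelihood function is evaluated."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        theta = (rng.uniform(0.01, 0.5), rng.uniform(0.1, 1.0))  # assumed priors
        distance = rmse(simulate_follower(*theta, lead_speeds), observed)
        draws.append((distance, theta))
    draws.sort(key=lambda item: item[0])
    return draws[:n_keep], draws


# Synthetic example: the leader cruises, slows from 20 to 15 m/s, then cruises.
lead = [20.0] * 50 + [20.0 - 0.05 * i for i in range(100)] + [15.0] * 150
observed = simulate_follower(0.2, 0.5, lead)   # "true" gains, for illustration
accepted, _ = abc_rejection(observed, lead, n_draws=500, n_keep=50)
```

Because only forward simulation is required, the same loop applies unchanged to a controller with no closed-form likelihood, such as MPC; only `simulate_follower` would need to be swapped.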

An Accelerated Filter for Critical Scenario Identification in Automated Driving Function Testing: A Model-Free Approach

J. Hu, T. Xu, X. Yan, H. Wang, and J. Lai

This article proposes an enhanced filter for critical scenario identification, which bears the following features: i) automated-driving-function-specific scenario identification; ii) high coverage of critical scenarios; iii) enhanced identification efficiency by avoiding adopting a surrogate model; and iv) high reliability of critical scenario identification. To enable the above features, the proposed filter formulates the identification problem into an optimization problem and solves it with a model-free approach. Experiments have been conducted to evaluate and validate the proposed filter. The results confirm that this filter is able to improve coverage, efficiency, and reliability of critical scenario identification compared to the state-of-the-art filter.

Service-Oriented Edge Collaboration: Digital Twin Enabled Edge Collaboration for Composite Services in AVNs

Y. Hui, X. Ma, C. Li, N. Cheng, R. Chen, Z. Yin, T. H. Luan, and G. Mao

A digital-twin-enabled edge collaboration scheme for composite services is proposed. A coalition game is designed to determine the optimal service composition form for each basic service and a Stackelberg game is designed to measure the performance of the formed coalition structure. The simulation results demonstrate that the proposed scheme can bring the highest utilities to both the service requesters and service providers.

Multimodal Transport Scheme Optimization and Capacity Allocation Considering Customer Classification

B. Han, Y. Chi, Y. Xu, and Y. Park

This study addresses the underdeveloped multimodal transport operation system by integrating customer classification and transportation solution optimization. The authors developed a joint optimization model for multimodal transport schemes, capacity allocation, and pricing based on customer demand characteristics and price sensitivities. A hybrid particle swarm algorithm was used to solve the model, which showed that customer classification strategies significantly improve profit by 7% when demand is unstable. The findings provide valuable insights for multimodal operators to systematically handle customer classification issues and optimize transportation schemes, thereby enhancing profits and offering decision-making references in multimodal operation management.

Enhancing Cyclist Safety Through Driver Gaze Analysis at Intersections With Cycle Lanes

J. A. Abbasi, A. Parsi, N. Ringelstein, P. Reilhac, E. Jones, and M. Glavin

Dedicated cycle lanes in urban areas enhance cyclist safety, but accidents still occur at intersections, particularly when drivers fail to notice cyclists while turning. A study monitored drivers navigating five intersections using non-invasive eye-tracking technology and vehicle sensors. The results revealed that 83% of drivers neglected to check their wing mirrors before or during turns, endangering cyclists, pedestrians, and others. An algorithm analyzing driver gaze patterns during turns identified unsafe behavior. These findings can inform the improvement of advanced driver-assistance systems (ADASs) to promote safer roads for all users.

A Preset-Time Method for Multi-Robot Coordination With Application to Package Delivery

W. Zhang and G. Hu

This article proposes time-homogeneous and time-heterogeneous preset-time algorithms to solve the package delivery problem. Specifically, the considered package delivery problem is formulated in two phases, an “Active phase” and a “Sleeping phase,” over which vehicles use neighboring information to achieve coordination. Tools from the Lyapunov-based method are used to derive the condition guaranteeing the coordination of vehicles under a very mild communication topology requirement. It is also demonstrated that coordination among vehicles is determined by the maximal preset time. Numerical examples are used to illustrate the utility of the proposed preset-time coordination algorithms and the validity of the derived theoretical finding.

Enhancing Autonomous Driving Decision: A Hybrid Deep Reinforcement Learning-Kinematic-Based Autopilot Framework for Complex Motorway Scenes

Y. Lu, H. Ma, E. Smart, and H. Yu

Autonomous vehicles (AVs) face persistent challenges in intelligence, safety, and reliability in complex motorway scenarios. This study presents a hybrid autopilot framework combining deep reinforcement learning (DRL) with traditional methods to address these limitations. The framework integrates three modules: i) DRL develops adaptable and scalable driving policies for diverse scenarios; ii) a kinematic-based co-pilot strategy improves training efficiency and supports flexible decision-making; and iii) a rule-based system evaluates and finalizes actions in real-time, enhancing overall safety. The simulation results demonstrate the framework’s superiority over the baseline model in training efficiency, intelligence, safety, and reliability, offering a promising solution for advanced autonomous driving systems.

A Sparse Cross Attention-Based Graph Convolution Network With Auxiliary Information Awareness for Traffic Flow Prediction

L. Chen, Q. Zhao, G. Li, M. Zhou, C. Dai, Y. Feng, X. Liu, and J. Li

This article introduces AIMSAN, a deep encoder-decoder model designed to address computational and scalability challenges in traffic prediction using graph convolutional networks (GCNs). AIMSAN integrates an auxiliary information-aware module (AIM) and a sparse attention-based graph network (SAN). AIM efficiently embeds historical and future auxiliary data, such as weather and holidays, into tensors, while SAN employs cross-attention and diffusion GCN to capture spatial-temporal dynamics. By leveraging traffic node sparsity, AIMSAN significantly reduces the quadratic computational complexity of GCNs. The extensive experimental results show that AIMSAN achieves comparable performance with state-of-the-art methods, while significantly reducing computational time and memory overhead of the model. This advancement highlights AIMSAN’s potential for scalable and resource-efficient traffic prediction tasks.

CAN-Trace Attack: Exploit CAN Messages to Uncover Driving Trajectories

X. Lin, B. Ma, X. Wang, G. Yu, Y. He, W. Ni, and R. P. Liu

The article highlights significant privacy vulnerabilities in modern vehicles’ controller area network (CAN), a critical communication protocol used in automatic vehicles. It introduces CAN-Trace, a novel mechanism that reconstructs detailed driving trajectories by leveraging CAN messages without relying on traditional GPS data. The research makes several key contributions, including the development of a trajectory reconstruction algorithm that transforms CAN messages into weighted graphs representing driving statuses. It also applies advanced graph-matching techniques to accurately map these trajectories onto road networks and introduces a new evaluation metric designed to handle data gaps and inconsistencies. By addressing these issues, the article underscores the pressing need for enhanced privacy and security measures in intelligent transportation systems, particularly in the context of vehicle communication networks.

Secure Authentication and Trust Management Scheme for Edge AI-Enabled Cyber-Physical Systems

X. Xiang, J. Cao, and W. Fan

This article proposes a lightweight decentralized authentication and trust management scheme for edge AI-enabled CPSs. The scheme supports access control based on extended chaotic maps and meets the privacy and security requirements of data transmission in a broader sense. It can be used to check the credibility of data collected by smart devices/sensor nodes. The security analysis of the solution shows that it can withstand various well-known attacks in CPS environments. The performance evaluation and experimental results demonstrate that the proposed scheme outperforms existing schemes in terms of both security and performance.

Predicting Pedestrian Crossing Intentions in Adverse Weather With Self-Attention Models

A. Elgazwy, K. Elgazzar, and A. Khamis

The article presents a novel framework for predicting pedestrian crossing intentions in adverse weather conditions using a transformer-based architecture integrated with an image enhancement pipeline. Addressing limitations in robustness and inference time of existing vision-based models, the framework enhances image quality using classical enhancement techniques before feeding data into a self-attention transformer network. Visual and non-visual features, including bounding box coordinates, pose key points, and vehicle speed, are fused through hierarchical and total fusion methods for optimal prediction accuracy. Evaluated on the JAAD dataset, the model achieves state-of-the-art performance with improved accuracy and reduced inference time. Real-time deployment tests further confirm the framework’s efficiency, offering a reliable solution for assisted and automated driving systems under challenging environmental conditions.

SmartRail: A System for the Continuous Monitoring of the Track Geometry Based on Embedded Arrays of Fiber Optic Sensors

G. Santamato, L. Tozzetti, M. Solazzi, E. Fedeli, and F. Di Pasquale

In this work, the authors propose the concept of the smart rail, an innovative system for the continuous monitoring of the track geometry based on embedded arrays of fiber Bragg grating (FBG) sensors and Raman-based distributed temperature sensors. First, they discuss how the technology design, based on a custom metallic patch embedding the FBG sensors and brazed on the track, overcomes the robustness concerns of the state-of-the-art. The metrological principle is formulated based on an analytical/FE model that correlates the measured signals with the local curvature deformation of the rail and then reconstructs the global track geometry. The effect of spatial sampling on the detection of even short-wave defects, a crucial trade-off between effectiveness and complexity, is addressed through simulations. The experiments performed on the first prototype demonstrate an efficient strain transfer, in excellent agreement with the theoretical predictions. Hence, the proposed technology seems very promising for the next generation of monitoring systems, in terms of robustness and compatibility with maintenance operations.

Repeated Route Naturalistic Driver Behavior Analysis Using Motion and Gaze Measurements

B. Adhikari, Z. Durić, D. Wijesekera, and B. Yu

This article introduces the Repeated Route Naturalistic Driving Dataset (R2ND2), a novel dataset for driver behavior analysis (DBA) featuring vehicular and calibrated driver gaze data collected across diverse traffic scenarios. Data collection utilized driver-facing IR cameras, a CAN bus decoding sensor, and a high-resolution RGB camera. The major contributions of R2ND2 are a multi-module data analysis with diverse features that allow for accurate driver identification (98.35% with existing models) and an analysis of the influence of traffic conditions, route familiarity, and driving experience on driver behavior. Key findings include the following: novice drivers exhibit higher risk perception and focus on a narrow field of view, especially while navigating at high speed or on a highway; experienced drivers maintain consistent focus across diverse traffic scenarios; and drivers with intermediate experience show inconsistencies, most noticeably an increase in fixations while driving through familiar routes. In the future, R2ND2 will facilitate further research in driver behavior analysis and driver-focused ADAS.

CCLDet: A Cross-Modality and Cross-Domain Low-Light Detector

X. Shang, N. Li, D. Li, J. Lv, W. Zhao, R. Zhang, and J. Xu

At present, the main approach to improving detection performance in low light is to fuse RGB images with infrared images. However, the current methods often overlook the impact of illumination changes on RGB images and ignore the important role of high-frequency information in object detection, especially low-light target detection. In this article, the authors propose a cross-modality and cross-domain low-light detector (CCLDet) for low-light vehicle detection. Extensive experiments on three challenging RGB-infrared object detection datasets demonstrate the advantages of CCLDet over popular object detectors in terms of mAP and parameter count.

Framework of Adaptive Driving: Linking Situation Awareness, Driving Goals, and Driving Intentions Using Eye-Tracking and Vehicle Kinetic Data

H.-Y. Lai

This study introduces the framework of adaptive driving (FAD), which defines five driving goals by integrating the skill-rule-knowledge (SRK) model with proactive or reactive strategies. Eye-tracking and kinetic data were analyzed to examine how situational awareness (SA) and driving intentions are formed. An exploratory factor analysis identified eight factors, classified as SA- or maneuver-related. “Cognitive load” reflects SRK-related mental activity, while “Saccade on the surroundings” and “Saccade movement” distinguish proactive from reactive strategies. Variations in “Saccade movement” indicate diverse considerations across the driving goals. “Active acceleration” occurs in non-risk contexts, whereas “Deceleration” addresses emerging risks. In risk scenarios, “Steering strategy” suggests steering when SA is sufficient, linked to “Front observation,” “Saccade movement,” and “Cognitive load.” Under extreme urgency, “Lateral movement” replaces “Steering strategy,” indicating abrupt steering without sufficient SA. These findings highlight key physiological features supporting AI-driven identification of drivers’ emerging needs for future driver assistance systems.

Electric Vehicle Routing Optimization for Postal Delivery and Waste Collection in Smart Cities

M. A. d. Cacho Estil-les, A. M. Mangini, M. Roccotelli, and M. P. Fanti

An optimization of postal delivery and waste collection in smart cities is addressed using electric vehicle routing problems. Two mixed integer linear programming models are proposed to minimize route length while respecting working time and battery charge constraints. The models incorporate smart charging strategies to reduce power grid demand peaks at both district and charging station levels, making them suitable for large-scale applications. To manage the complexity of the problem, a heuristic algorithm combining clustering and routing strategies is developed. Two case studies demonstrate the effectiveness of the proposed models for optimizing logistics in urban environments.

A Real-Time Terrain-Adaptive Local Trajectory Planner for High-Speed Autonomous Off-Road Navigation on Deformable Terrains

S. Yu, C. Shen, J. Dallas, B. I. Epureanu, P. Jayakumar, and T. Ersal

A terrain-adaptive local trajectory planner for the autonomous operation of off-road vehicles on deformable terrains is introduced. The approach integrates an optimal-control-oriented terramechanics model to account for terrain deformation and employs a terrain estimator using the unscented Kalman filter to dynamically adjust the sinkage exponent online. Extensive simulations and real-world experiments validate the formulation against rigid-terrain benchmarks, demonstrating superior safety and performance. The results show notable improvements, highlighting the importance of incorporating terramechanics knowledge into trajectory planning for deformable terrains.

Knowledge Guided Visual Transformers for Intelligent Transportation Systems

A. Belhadi, Y. Djenouri, A. N. Belbachir, T. Michalak, and G. Srivastava

The authors propose a novel approach for computer vision tasks in intelligent transportation systems, emphasizing data security through federated learning. The method utilizes visual transformers, training multiple models for each image and storing visual features and loss values. They introduce a Shapley value model based on model performance consistency to select optimal models during testing. For enhanced security, they employ a federated learning strategy, clustering users with contrastive clustering to create both global and customized local models. Users receive both global and local models for tailored computer vision applications. Evaluating knowledge guided visual transformers for ITS (KGVT-ITS) on pedestrian detection, abnormal event, and near-crash detection challenges, they demonstrate its superior performance, with a notable 8% improvement over existing ITS methods.

Recovering Crowd Trajectories in Invisible Area of Camera Networks

Y. Li, W. Wu, H. Zhao, Y. Shi, and Y. Lyu

This article addresses the challenge of recovering crowd trajectories in the blind area of sparse camera networks in crowded public spaces. Traditional multi-object tracking (MOT) methods rely on appearance or spatial-temporal features to follow individuals across cameras. However, these methods struggle in dense environments due to cluttered appearances and highly uncertain movements. This study proposes a novel approach that reduces reliance on appearance features, achieving better spatial-temporal feature matching by estimating the most likely travel time between segmented tracklet observations of individuals, with detailed consideration of pedestrian interactions. Subsequently, trajectories of matched tracklets in the blind area are recovered with a high-fidelity crowd simulation model. Experiments on real-world datasets demonstrate that this method outperforms existing spatial-temporal-based MOT models and improves on the appearance-based MOT models in terms of association accuracy and trajectory fidelity in crowded settings.

Tightly-Coupled 6DoF Localization in Complex Environments With GNSS Raw Data

Y. Shi, B. Lian, Y. Zeng, and E. Kurniawan

A tightly coupled framework based on nonlinear optimization is proposed for vision, LiDAR, inertial, and GNSS raw data to enhance the robustness of six-degree-of-freedom (6DOF) pose estimation for autonomous systems in complex environments. The article validates the effectiveness of the proposed optimization factor model in improving position and orientation estimation accuracy for GNSS, LiDAR, and visual data through simulations. In addition, several real-world datasets are used to compare the proposed algorithm with several existing open-source programs, evaluating its performance in terms of computational efficiency, pose estimation accuracy, worst-case scenarios, and reliability. The experimental results show that, although the total processing time increases, the proposed fusion algorithm improves position and orientation estimation accuracy by at least 58.0%.

Continuous Berth Allocation and Time-Variant Quay Crane Assignment: Memetic Algorithm With a Heuristic Decoding Method

L.-S. Xu, T. Huang, B.-W. Zhao, Y.-J. Gong, and J. Liu

Addressing the continuous berth allocation and time-variant quay crane assignment problem (C/T-V BACAP), this study introduces a novel memetic algorithm, named HMA, which effectively enhances container terminal operations. The proposed HMA introduces a three-stage heuristic decoding method, a clustering-based evolutionary strategy, and a target-guided local search operator. The experimental results confirm HMA’s superior performance, significantly improving operational efficiency over existing methods.

Novel Finite-Time Controller With Improved Auxiliary Adaptive Law for Hypersonic Vehicle Subject to Actuator Constraints

Y. Ding, X. Yue, W. Li, P. Huang, and N. Li

A novel adaptive finite-time controller is proposed for flexible air-breathing hypersonic vehicles with actuator saturations. First, an adaptive dynamic inversion control is presented for the velocity subsystem. The influence of actuator saturation is handled by an improved auxiliary adaptive law (IAAL). Compared with the conventional adaptive law, the IAAL achieves faster convergence of the tracking error and effectively suppresses abrupt changes in the control signal. Second, an adaptive continuous sliding mode control is designed for the height subsystem, in which an integral sliding surface is established based on a continuous fast higher-order sliding mode algorithm (CFSMA). Compared with the conventional finite-time high-order regulator, CFSMA drives the states to converge faster and allows the response speed of the system to be adjusted conveniently without complicated parameter selection. Ultimately, with the aid of the novel adaptive finite-time controller, the flexible air-breathing hypersonic vehicle subject to actuator constraints achieves faster convergence and higher tracking precision than with existing conventional adaptive controllers.

Panoramic Sea-Ice Map Construction for Polar Navigation Based on Multi-Perspective Images Projection and Camera Poses Rectification

R. Lu, J. Shang, J. Wu, Y. Wang, and D. Ma

A novel online panoramic method for polar sea ice mapping is proposed, overcoming traditional challenges related to parallax tolerance, stitching robustness, and mapping accuracy. The approach integrates a dynamic inverse projection module, a planar feature registration technique, and a camera pose rectification module, which together improve the quality of map stitching during ship movement. By utilizing local map fusion, the method enables real-time restoration of bird’s-eye view (BEV) sea-ice images and facilitates global sea-ice map construction. The experimental results demonstrate enhanced map accuracy and stitching performance compared to existing methods, making the approach well-suited for polar SLAM and path planning applications.

Event-Triggered Self-Organizing Swarm Control of Distributed Unmanned Surface Vehicles

N. Wang, W. Jia, H. Wu, and Y. Wang

Autonomous mass transportation by sea calls for an economical condition-based cooperative control solution for the collective swarming of distributed unmanned surface vehicles (USVs) that suffer from narrow-band communication and unstructured unknowns, yet such a solution has so far been unavailable. In this article, an event-triggered self-organizing swarm control (ESSC) scheme is devised to flexibly steer a group of USVs, with the following main contributions: 1) a suite of self-organizing swarm mechanisms consisting of aggregation, collision avoidance, and heading alignment is holistically established, such that emergent behaviors of swarm kinetics can self-evolve for flexible morphology; 2) within an adaptive dynamic programming framework, an event-triggered optimal solution to USV swarm control is worked out by deriving an optimization-oriented event-triggering mechanism from swarm kinetics tracking errors, thereby striking a rational balance between channel occupation and tracking accuracy; and 3) approximately optimal control actions are acquired by employing actor-critic reinforcement learning networks to solve the Hamilton–Jacobi–Bellman equation, thereby assuring communication parsimony and control optimality simultaneously. Performance validations with intensive comparisons to time-triggered methods demonstrate the effectiveness and superiority of the scheme in terms of tracking accuracy, channel occupancy, and control optimality. In addition, an extensive application to roundup scenarios shows that the proposed ESSC scheme extends feasibly to wide-range tasks.
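
The general idea behind event-triggered communication can be illustrated with a minimal sketch. It is not the article's optimization-oriented triggering mechanism: it assumes a plain fixed-threshold rule in which a vehicle transmits a new value only when it has drifted from the last transmitted value by more than a threshold. The function name, the threshold, and the sample signal are all hypothetical.

```python
def event_triggered_samples(signal, threshold):
    """Return the indices at which a value would be transmitted.

    Assumed rule (illustrative only): always send the first sample, then
    send again only when the current value differs from the last *sent*
    value by more than `threshold`.
    """
    if not signal:
        return []
    sent_indices = [0]
    last_sent = signal[0]
    for k, value in enumerate(signal[1:], start=1):
        if abs(value - last_sent) > threshold:
            sent_indices.append(k)
            last_sent = value
    return sent_indices


# Hypothetical tracking-error samples; a time-triggered scheme would send all 7.
errors = [0.0, 0.05, 0.12, 0.13, 0.30, 0.31, 0.29]
sent = event_triggered_samples(errors, threshold=0.1)
```

In this toy run only three of the seven samples are transmitted, which is the kind of trade-off between channel occupation and tracking accuracy that the abstract refers to.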

Game Projection and Robustness for Game-Theoretic Autonomous Driving

M. Liu, H. E. Tseng, D. Filev, A. Girard, and I. Kolmanovsky

Game-theoretic decision-making has the potential to bring human-like reasoning skills to autonomous vehicles (AVs), fostering trust between humans and AVs. However, to make these approaches sufficiently practical for real-world use, challenges such as game complexity and incomplete information have to be addressed. Game complexity refers to the difficulties in solving a game-theoretic problem, which include solution existence, algorithm convergence, and scalability. The authors showed in recent work that a possible solution to overcoming these difficulties is to use potential games. However, constructing a potential game often requires specific cost-function designs, limiting its broad use. To address this challenge, they propose to employ a game projection technique in this article, relaxing the cost-function design conditions and making the potential game approach applicable to broader scenarios, even including the ones that cannot be modeled as a potential game. Incomplete information refers to the ego vehicle’s lack of knowledge of other traffic agents’ cost functions. In a driving scenario, deviations of the cost functions that the ego vehicle assumes or estimates for others from their actual ones are often inevitable. This necessitates a robustness analysis of the game-theoretic solution. This article defines the robustness margin of a game solution as the maximum magnitude of cost function deviations that can be accommodated without changing the optimality of the game solution. With this definition, closed-form robustness margins are derived. Numerical studies using highway lane-changing scenarios are reported.
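
As a rough illustration of the robustness-margin idea, and not the article's closed-form result, consider a single decision over a finite action set in which every cost may be misestimated by at most ε. The minimizing action cannot change as long as ε stays below half the gap between the best and second-best cost. The function name and the example costs are hypothetical.

```python
def robustness_margin(costs):
    """Half the gap between the two smallest costs.

    Simplified single-decision analogue (an assumption, not the article's
    game-theoretic derivation): if each entry of `costs` may be off by at
    most eps, the best action is guaranteed to stay best whenever
    eps < (second_best - best) / 2.
    """
    if len(costs) < 2:
        raise ValueError("need at least two candidate actions")
    best, runner_up = sorted(costs)[:2]
    return (runner_up - best) / 2.0


# Hypothetical costs for: keep lane, change left, change right.
lane_change_costs = [4.0, 2.5, 3.1]
margin = robustness_margin(lane_change_costs)
```

A margin of zero, which occurs when two actions tie for the lowest cost, means that any estimation error at all could flip the decision.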

Dynamic Control Authority Allocation in Indirect Shared Control for Steering Assistance

Y. Chen, H. Zhang, H. Chen, J. Huang, B. Wang, Z. Xiong, Y. Wang, and X. Yuan

This study presents a novel dynamic control authority allocation method for shared control in autonomous vehicles, enhancing human-machine interaction. Unlike traditional mixed-initiative control, which uses fixed weights for human and vehicle inputs, the proposed approach employs optimization-based techniques to dynamically allocate control authority, ensuring safety and optimal performance. A convex quadratic program (QP) is formulated, incorporating control barrier functions (CBFs) for safety and control Lyapunov functions (CLFs) to meet automated control objectives. The cost function is designed to increase the human weight based on input magnitude, while smooth transitions are achieved by optimizing the change rate of the weight. The method is validated through human-in-the-loop (HmIL) and hardware-in-the-loop (HdIL) experiments in lane-changing scenarios. The results demonstrate that the proposed method outperforms index-based approaches, offering superior agility, safety, and comfort in autonomous vehicle control.
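
To make the role of a control barrier function concrete, here is a minimal sketch for a scalar system. It is not the article's shared-control formulation. It assumes single-integrator dynamics ẋ = u, a safe set x ≤ x_max with barrier h(x) = x_max − x, and a linear gain α. Under these assumptions the QP "minimize (u − u_des)² subject to ḣ ≥ −αh" has the closed-form solution u = min(u_des, αh). All names and numbers are hypothetical.

```python
def cbf_filter(u_des, x, x_max, alpha=1.0):
    """Closest input to u_des that satisfies the barrier condition.

    For x_dot = u and h(x) = x_max - x, the condition h_dot >= -alpha * h
    becomes u <= alpha * (x_max - x), so the scalar QP reduces to a clip.
    """
    h = x_max - x
    return min(u_des, alpha * h)


# Forward-Euler simulation: the desired input always pushes toward the
# boundary, and the filter slows the state down as it approaches x_max.
x, dt = 0.0, 0.01
for _ in range(1000):
    u = cbf_filter(u_des=1.0, x=x, x_max=1.0, alpha=2.0)
    x += dt * u
```

Far from the boundary the desired input passes through unchanged; near the boundary the filter overrides it, so the state approaches x_max without crossing it in this simulation.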

A Dual Function Intelligent Reflecting Surface in Integrated Radar Communication System

D. Bao and R. Guo

A novel intelligent reflecting surface (IRS) aided dual-function radar-communication (DFRC) system is proposed in this article. IRSs are not only used to improve communication channels, but also employed as cooperative intelligent targets to enhance some objects of interest, such as vehicles and pedestrians in an intelligent transportation system. The design objective is to maximize the radar performance while keeping the communication rate constant. The measure of radar performance is the Cramér-Rao bound (CRB) of the direction of departure (DOD), which is a special capability of a collocated antenna array multiple-input multiple-output (MIMO) radar. Simulation experiments show the ability of the proposed optimization algorithm to balance the radar performance and the communication rate.

Optimizing Mixed Traffic Flow: Longitudinal Control of Connected and Automated Vehicles to Mitigate Traffic Oscillations

C. Liu, F. Zheng, H. X. Liu, and X. Liu

This article presents a traffic oscillation mitigation-oriented optimal control framework for connected and automated vehicles (CAVs) in a mixed traffic environment where the behavior of human-driven vehicles (HVs) is unknown. The primary objective of this framework is to alleviate traffic oscillations, thereby improving overall traffic flow. To achieve this, the authors introduce a novel total equilibrium spacing estimation method, incorporating stochastic parameters into a car-following model and quantifying the deviation between the mean and equilibrium spacing. This estimation, integrated with a jam-absorption driving strategy, is embedded into a model predictive control (MPC) model with the objective of mitigating traffic oscillations. The efficacy of the proposed control method is evaluated through two experiments utilizing real vehicle trajectory datasets. The first experiment focuses on a single CAV, exploring the impact of key controller parameters on oscillation mitigation. The results demonstrate the optimal performance of the proposed oscillation mitigation-based MPC (OM-MPC) model, even with a shorter CAV distance (e.g., 100 m), revealing a positive correlation between CAV distance and suitable preset oscillation duration. The second experiment extends the investigation to multiple stop-and-go shockwaves and varying CAV penetration rates. A comparative analysis of control models, including OM-MPC, regular MPC, and proportional-integral with saturation, is conducted based on velocity mean (VM), road segment congestion index (RI), and vehicle stop times (VSTs). The findings underscore the effectiveness of the proposed control method in mitigating traffic oscillations and enhancing overall traffic efficiency, establishing it as the optimal choice among the three approaches.

Dynamic Route Optimization With Multi-Category Constraints for POIs Visit

J. Li, C. Liu, D. He, L. Li, X. Zhou, and R. Zhu

The article introduces dynamic route optimization with multi-category constraints (DROMCs) for optimizing travel time in visiting points of interest (POIs), considering spatial and temporal factors. A novel path enumeration algorithm is proposed, modifying the k-fastest paths approach to satisfy user-specific requirements under time and POI availability constraints. Efficiency is bolstered by adapting the KSP algorithm for POI paths, implementing a binary-encoded shared prefix tree (SPFT) for data handling, employing a grid-based heuristic, and deploying pruning techniques for expediting calculations and accommodating POI operation times. Experiments show the proposed method outperforms existing ones, providing more time-efficient routing solutions.

DAGCAN: Decoupled Adaptive Graph Convolution Attention Network for Traffic Forecasting

Q. Yuan, J. Wang, Y. Han, Z. Liu, and W. Liu

It is necessary to establish a spatio-temporal correlation model of the traffic data to predict the state of the transportation system. Existing research has focused on traditional graph neural networks, which use predefined graphs and have shared parameters. However, intuitive predefined graphs introduce biases into prediction tasks, and fine-grained spatio-temporal information cannot be obtained by a parameter-sharing model. In this article, the authors consider it crucial to learn node-specific parameters and adaptive graphs with complete edge information. To show this, they design a model based on a graph structure that decouples nodes and edges into two modules. Each module extracts temporal and spatial features simultaneously. The adaptive node optimization module is used to learn the specific parameter patterns of all nodes, and the adaptive edge optimization module aims to mine the interdependencies among different nodes. Then, they propose a decoupled adaptive graph convolution attention network for traffic forecasting (DAGCAN), which relies on the above two modules to dynamically capture the fine-grained spatio-temporal relationships in traffic data. The experimental results on four public transportation datasets demonstrate that the model can further improve the accuracy of traffic prediction.

Effective Learning Mechanism Based on Reward-Oriented Hierarchies for Sim-to-Real Adaption in Autonomous Driving Systems

Z. Hong

A central difficulty of sim-to-real adaption in intelligent transportation systems is “catastrophic forgetting,” the inability to retain previously learned skills, which makes learning inefficient. This article tackles the problem by taking advantage of reconfigurable Sim2Real policies built from simpler, previously learned sub-tasks. Such a learning mechanism breaks down the behavior-aware experience into two distinct types: basic task-agnostic background and dynamic object-specific foreground. It further reveals the intrinsic association between previously learned knowledge and time-varying events in the real world according to the reuse of skill motion via mirrored composition. Extensive validation on both simulated and real-world Sim2Real testbenches of challenging autonomous driving scenarios demonstrates the superiority of the proposed learning mechanism in improving task efficiency and handling stochasticity throughout learning.

Revealing Trip Purposes in Raw GPS Data by Applying a Multi-Phase Clustering Approach to Semantic Trajectories

J. Hamann and T. Hagen

This study presents a multi-phase clustering method to identify trip purposes in passively collected raw GPS data of vehicles. The proposed approach uses only unsupervised models, eliminating the need for labeled data. The method includes a data-driven city segmentation, enrichment of explanatory variables characterizing trip purposes, and a multi-level clustering approach using an algorithm that can simultaneously process both numerical and semantic variables. The analysis identifies twelve different types of urban areas and eight different trip purposes across 170000 trips, with the results validated by a comprehensive comparison with the results of a national travel survey in Germany.

GFA-SMT: Geometric Feature Aggregation and Self-Attention in a Multi-Head Transformer for 3D Object Detection in Autonomous Vehicles

H. Mushtaq, X. Deng, P. Jiang, S. Wan, M. Ali, and I. Ullah

Three-dimensional object detection by autonomous vehicles is integral to intelligent transportation. Existing systems often compromise essential foreground point features and local spatial interactions through random down-sampling, focusing primarily on local feature extraction. However, this neglects interactions among distant yet significant points, limiting semantic information and detection performance due to inherent point cloud data sparsity. Addressing this, the proposed geometric feature aggregation and self-attention in a multi-head transformer (GFA-SMT) architecture leverages graph convolutional networks and multi-channel transformers to enhance weak semantic information of distant sparse objects. GFA-SMT comprises three modules: distance suppression for local receptive fields (DsLRFs), geometric feature aggregator with multi-head self-attention (GFaSA), and predicted key-point weighting and refinement (PKwR). DsLRF preserves foreground features, GFaSA encodes similar features and aggregates edge features, while PKwR focuses on key-points for enhancing geometric knowledge of distant and sparse objects. Extensive experiments on KITTI, DAIR-V2X-I, and NuScenes datasets show significant enhancements in widely used techniques, resulting in notable increases in average precision (AP) for 3-D object detection: 4.08%, 5.56%, and 4.62%, respectively, on the KITTI test dataset. GFA-SMT enhances point cloud detection accuracy, particularly at medium and long distances, with minimal impact on run-time performance and model parameters.

MonoAMNet: Three-Stage Real-Time Monocular 3D Object Detection With Adaptive Methods

H. Pan, Y. Jia, J. Wang, and W. Sun

Monocular 3-D object detection finds applications in various fields, notably in intelligent driving, due to its cost-effectiveness and ease of deployment. However, its accuracy significantly lags behind LiDAR-based methods, primarily because the monocular depth estimation problem is inherently challenging. While some methods leverage additional information to aid in network training and enhance performance, they are hindered by their reliance on specific datasets. The authors contend that many components of monocular 3-D object detection lack the necessary adaptability, impeding the performance of the detector. In this article, they propose six adaptive methods addressing issues related to network structure, loss function, and optimizer. These methods specifically target the rigid components within the detector that hinder adaptability. In addition, they provide theoretical insights into the network output and propose two novel regression methods that make learning more straightforward for the network. Importantly, the approach does not depend on supplementary information, allowing for end-to-end training. In comparison with existing methods, the proposed approach demonstrates competitive speed and accuracy. On the KITTI dataset, the method achieves a 17.72% AP3D (IoU = 0.7, car, moderate), outperforming all previous monocular methods. In addition, the approach prioritizes speed, achieving a runtime of up to 52 FPS on an RTX 2080Ti GPU, surpassing all previous monocular methods. The source code is available at https://github.com/jiayisong/AMNet.
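
The intersection-over-union threshold of 0.7 quoted for the KITTI result means that a detection counts as correct only if its overlap with the ground-truth box reaches that ratio. A minimal sketch for axis-aligned 2-D boxes follows. The KITTI AP3D metric is computed on oriented 3-D boxes, so this is a deliberate simplification, and the function name and the `(x1, y1, x2, y2)` box format are assumptions.

```python
def iou_2d(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground truth and two candidate detections.
truth = (0.0, 0.0, 10.0, 10.0)
tight = iou_2d(truth, (1.0, 0.0, 10.0, 10.0))   # slightly narrow box
loose = iou_2d(truth, (5.0, 0.0, 15.0, 10.0))   # shifted by half a width
```

Under a 0.7 threshold the first candidate would be accepted and the second rejected, which shows how strict the criterion is for boxes that are only moderately misplaced.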

Multi-Agent Reinforcement Learning for Cooperative Transit Signal Priority to Promote Headway Adherence

M. Long and E. Chung

The article proposes a cooperative transit signal priority strategy with a variable phase for headway adherence under a multi-intersection network by multi-agent reinforcement learning. It considers four critical aspects, i.e., complicated states with multiple conflicting bus requests, rational actions constrained by domain knowledge, comprehensive rewards balancing buses and cars, and a collaborative training scheme among agents. The simulation results from a three-intersection environment and an entire-line network demonstrate the effectiveness of the proposed method in reducing bus headway deviation and passenger waiting times.

Enhancing Infrared Small Target Detection: A Saliency-Guided Multi-Task Learning Approach

Z. Liu, Y. Zhang, J. He, T. Zhang, S. u. Rehman, M. Saraee, and C. Sun

Object detection in infrared images poses a considerable challenge due to its small-scale targets, low contrast, and poor signal-to-clutter ratio, often resulting in a high false alarm rate. To improve the detection accuracy on infrared small targets, the authors introduce Light-SGMTLM, a lightweight and saliency-guided multi-task learning model. This model integrates saliency detection into the YOLOv5x framework through a parallel multi-task learning structure and employs a joint loss function during training. Such integration significantly alleviates the impact of complex backgrounds and improves the precision of small target localization. Moreover, they have developed a streamlined module, termed SIWD, to create a more agile backbone, which establishes an optimal balance between precision and efficiency, making the model more suitable for situations with limited computational resources. Comprehensive comparative experiments were conducted on six infrared small target datasets, namely, Small-ExtIRShip, Small-SSDD, IHAST, NUAA-SIRST, IRSTD-1k, and IRDST, and they assessed the model’s performance against ten leading target detection models, such as YOLOv7, YOLOv8, DINO, and Relation-DETR. The findings reveal that the method’s unique joint learning architecture, combining saliency and object detection tasks, significantly improves accuracy for infrared small target detection. Notably, it achieved impressive mean average precision (mAP) values of 92.60% and 75.71% on the NUAA-SIRST and IRSTD-1k datasets, respectively.

RSTR: A Two-Layer Taxi Repositioning Strategy Using Multi-Agent Reinforcement Learning

H. Yu, X. Guo, X. Luo, Z. Wu, and J. Zhao

This article proposes the RSTR model, which is a two-layer model that takes advantage of the two mainstream methods. The problem is modeled as a partially observable Markov decision process and the optimization objective is designed. To generate more accurate repositioning strategies and improve model training outcomes, a many-to-many scheduling mode is proposed and its effectiveness is demonstrated. Extensive experiments on real-world datasets show that RSTR can effectively balance supply and demand and outperform other baseline methods.

Traffic Road Visibility Retrieval in the Internet of Video Things Through Physical Feature-Based Learning Network

Y. Wang, L. Zhou, and Z. Xu

A novel framework for retrieving traffic road visibility in the Internet of Video Things (IoVT) is introduced, tackling the challenge of ambiguity in estimating visibility from video frames. By analyzing fog effects, the study reveals that eigenvalues of the observed image matrix approximate airlight components. This insight leads to a four-step framework: defining persistent scatterers, applying singular value decomposition for airlight separation, extracting physical features, and developing a hybrid convolutional LSTM network for precise visibility estimation. The comparative results demonstrate superior performance, with a correlation coefficient of 0.9484 and an averaged root mean square error of 681 m, outperforming Koschmieder law-based, CNN, and deep LSTM methods. Data and code are available at https://github.com/Z-H-XU/Benchmark-Visibility.
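
For context on the Koschmieder law-based baselines mentioned above, the law relates visibility V to the atmospheric extinction coefficient β through V = −ln(ε)/β, where ε is the contrast threshold. The sketch below assumes the commonly used threshold ε = 0.05 (some conventions use 0.02 instead). The function name is hypothetical, and this is not the article's code.

```python
import math


def koschmieder_visibility(beta_per_m, contrast_threshold=0.05):
    """Visibility in meters from an extinction coefficient in 1/m.

    Uses V = -ln(contrast_threshold) / beta. The default threshold of 0.05
    is an assumed convention, not a value taken from the article.
    """
    if beta_per_m <= 0:
        raise ValueError("extinction coefficient must be positive")
    if not 0 < contrast_threshold < 1:
        raise ValueError("contrast threshold must lie in (0, 1)")
    return -math.log(contrast_threshold) / beta_per_m


# Hypothetical fog with beta = 0.003 per meter.
visibility_m = koschmieder_visibility(0.003)
```

Because visibility is inversely proportional to β, small errors in the estimated extinction coefficient translate into large visibility errors in light fog, which is one reason learning-based estimators are compared against this baseline.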

Multi-Agent Game Theory-Based Coordinated Ramp Metering Method for Urban Expressways With Multi-Bottleneck

Q. Lin, W. Huang, Z. Wu, M. Zhang, and Z. He

Traditional coordinated ramp metering (CRM) methods generally concentrate on single-bottleneck scenarios and ignore the case of multiple bottlenecks. CRM can be improved by combining automatic vehicle identification (AVI) data with multi-agent deep reinforcement learning. This article proposes a distributed CRM strategy for multi-bottleneck expressways that minimizes the total travel time and balances equity among multiple on-ramps, using the individual trajectory information from AVI data. First, the article defines road segment units, road segment groups, and bottlenecks. Next, the problem is formulated as a potential game that captures the interaction among multiple bottlenecks. The controllers use the MADDPG algorithm to determine the green duration of the on-ramps. Finally, the proposed strategy is tested on a real-world urban expressway in the microsimulation platform SUMO. The experimental results demonstrate that the proposed strategy performs better than the baseline methods in eliminating mainline congestion and improving equity among the on-ramps.

Vision-Language Tracking With CLIP and Interactive Prompt Learning

H. Zhu, Q. Lu, L. Xue, P. Zhang, and G. Yuan

Vision-language tracking is an emerging topic in intelligent transportation systems, particularly significant in autonomous driving and road surveillance. It aims to combine visual and auxiliary linguistic modalities to co-locate the target object in a video sequence. Currently, multi-modal data scarcity and burdensome modality fusion are two major factors limiting the development of vision-language tracking. To tackle these issues, the authors propose an efficient and effective one-stage vision-language tracking framework (CPIPTrack) that unifies feature extraction and multi-modal fusion by interactive prompt learning. Feature extraction is performed by the high-performance vision-language foundation model CLIP, so the tracker inherits the strong generalization ability of the large-scale model. Modality fusion is simplified to a few lightweight prompts, leading to significant savings in computational resources. Specifically, they design three types of prompts to dynamically learn the layer-wise feature relationships between vision and language, facilitating rich context interactions while enabling adaptation of the pre-trained CLIP. In this manner, discriminative target-oriented visual features can be extracted by language and template guidance, which are used for subsequent reasoning. Because extra heavy modality fusion is eliminated, the proposed CPIPTrack shows high efficiency in both training and inference. CPIPTrack has been extensively evaluated on three benchmark datasets, and the experimental results demonstrate that it achieves a good performance-speed balance with an AUC of 66.0% on LaSOT and a runtime of 51.7 FPS on an RTX 2080 Super.

Two-Echelon Collaborative Location Routing Problem With Intuitionistic Fuzzy Multi-Demands for Sorted-Waste Collection and Transportation

C. Shang, L. Ma, and Y. Gao

This article explores a novel model for sorted-waste transportation, defined as the two-echelon collaborative location routing problem with intuitionistic fuzzy multi-demands. The authors develop a distributed heuristic based on fuzzy bi-means and adaptive large neighborhood segmented search to address this model. A refined Shapley model is constructed to allocate profit and identify the best coalition combination for each participant. Extensive computational experiments and a practical case study are conducted to show the efficiency of the proposed model and approach. Several relevant managerial insights are also derived to aid decision-making in waste sorting management.
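To illustrate the idea behind Shapley-based profit allocation, the sketch below computes the classical Shapley value, which is the textbook form and not the refined model of the article. The three participants and their coalition profits are hypothetical.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value(joined) - value(coalition)
            coalition = joined
    return {p: total / len(orders) for p, total in totals.items()}

# Hypothetical profit of every coalition among participants A, B, and C.
PROFIT = {
    frozenset(): 0.0,
    frozenset("A"): 10.0, frozenset("B"): 20.0, frozenset("C"): 30.0,
    frozenset("AB"): 40.0, frozenset("AC"): 50.0, frozenset("BC"): 60.0,
    frozenset("ABC"): 90.0,
}

shares = shapley_values(["A", "B", "C"], PROFIT.__getitem__)
```

The shares always sum to the profit of the grand coalition, which is what makes the rule usable for splitting collaborative gains. Enumerating all orders is only practical for a handful of participants.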

Integrated Optimization on Double-Side Cantilever Yard Crane Scheduling and Green Vehicle Path Planning at U-Shaped Yard

W. Peng, D. Wang, H. Qiu, F. Chu, and Y. Yin

The problem of scheduling and path planning for double-side cantilever yard cranes, automated guided vehicles, and external trucks in U-shaped automated container terminals is addressed. A bi-objective mixed integer programming model is proposed to minimize makespan and energy consumption, incorporating workload balancing, conflict avoidance, and optimal parking slot assignment. An improved multi-objective particle swarm optimization algorithm is developed. The experimental results demonstrate the utility of the proposed model and algorithm.

Toward Proactive-Aware Autonomous Driving: A Reinforcement Learning Approach Utilizing Expert Priors During Unprotected Turns

J. Fan, Y. Ni, D. Zhao, P. Hang, and J. Sun

To address the challenge of autonomous vehicle (AV) interactions with human drivers in ambiguous right-of-way scenarios, a proactive-aware decision-making framework is developed. By merging reinforcement learning (RL) with parameterized modeling, human-expert priors under ambiguous right-of-way are utilized to guide the learning of the RL agent. A human decision-updating mechanism, governed by interpretable parameters derived from expert priors, is introduced into the AV strategy. The proposed method achieves balanced safety and efficiency in tackling ambiguities, with superior decision-making performance via the guidance of expert priors when compared with established baselines. Furthermore, the results indicate that the proposed method enables AVs to accelerate the convergence during the interaction by consistent probing and decision updates.

Unity Is Strength: Unifying Convolutional and Transformeral Features for Better Person Re-Identification

Y. Wang, P. Zhang, X. Liu, Z. Tu, and H. Lu

Person re-identification (ReID) aims to retrieve a specific person across non-overlapping cameras, which greatly helps intelligent transportation systems. Convolutional neural networks (CNNs) and transformers have unique strengths in extracting local and global features, respectively. Considering this fact, the authors focus on the mutual fusion between them to learn more comprehensive representations of persons. In particular, they utilize the complementary integration of deep features from different model structures. They propose a novel fusion framework called FusionReID to unify the strengths of CNNs and transformers for image-based person ReID. More specifically, they first deploy a dual-branch feature extraction (DFE) to extract features through CNNs and transformers from a single image. Moreover, they design a novel dual-attention mutual fusion (DMF) to achieve sufficient feature fusion. The DMF comprises local refinement units (LRUs) and heterogeneous transmission modules (HTMs). Each LRU utilizes depth-separable convolutions to align deep features in channel dimensions and spatial sizes. Each HTM consists of a shared encoding unit (SEU) and two mutual fusion units (MFUs). Through the continuous stacking of HTMs, deep features after the LRUs are repeatedly utilized to generate more discriminative features. Extensive experiments on three public ReID benchmarks demonstrate that the method attains performance superior to most state-of-the-art methods. The source code is available at https://github.com/924973292/FusionReID.

Elevation-Aware Map Matching Model Leveraging Transfer Learning in Sparse Data Conditions

J. Tang, S. Zheng, B. Yu, and X. Liu

This article presents an elevation-aware map matching model (EAM^3) for intelligent urban transportation under sparse data conditions. The model integrates an elevation-aware unit using imagery and sensor data to acquire elevation information for urban roads, improving map-matching accuracy. A transfer learning approach is used to fine-tune the model across domains, reducing development costs. The model is evaluated on real-world datasets with four metrics, showing superior performance in complex urban scenarios. The results highlight the effectiveness of the elevation-aware unit in enhancing model robustness and the significance of elevation data in map matching.

An Interactive Prediction and Planning Method for Lane Change Trajectories

W. Xiong, J. Chen, X. Zhang, Q. Wang, and Z. Qi

When an autonomous vehicle attempts to change lanes, multiple factors must be considered, such as road conditions and dynamic interactions with other traffic participants. This article introduces a novel lane-changing method that interactively combines prediction and planning to cope with complex traffic scenarios. First, a new target vehicle trajectory prediction network based on hierarchical attention modules is proposed. The initial predictions are fed into a combined sampling and optimization method for selecting an ego vehicle lane-changing maneuver. Unlike previous unidirectional frameworks, the selected maneuver is fed back into the prediction network so that extra ego vehicle planning information can be incorporated to reduce prediction uncertainties. Finally, based on the planning-informed predictions and the selected maneuver, the authors design a nonlinear model predictive controller to achieve a safe, efficient, and comfortable lane change trajectory in the Frenet coordinate system. The root mean square errors of the proposed prediction network at the fifth second on the NGSIM and HighD test sets are 3.54 and 1.18 m, respectively, both of which achieve state-of-the-art performance. Moreover, the results of real traffic data-based simulations and real-vehicle experiments highlight the effectiveness of the lane-changing framework.
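For readers unfamiliar with Frenet coordinates, the following sketch projects a Cartesian point onto a reference path and returns the arc length s along the path and the signed lateral offset d. It assumes a polyline reference path and a "left is positive" sign convention; the function name and sample path are illustrative only and do not come from the article.

```python
import math

def cartesian_to_frenet(x, y, path):
    """Return (s, d) of point (x, y) relative to a polyline reference path."""
    best = None  # (distance, s, d) of the closest projection found so far
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip repeated waypoints
        # Clamp the projection parameter so the foot point stays on the segment.
        t = ((x - x0) * dx + (y - y0) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        px, py = x0 + t * dx, y0 + t * dy
        dist = math.hypot(x - px, y - py)
        # The cross product sign tells on which side of the path the point lies.
        cross = dx * (y - y0) - dy * (x - x0)
        d = math.copysign(dist, cross)
        if best is None or dist < best[0]:
            best = (dist, s_start + t * seg_len, d)
        s_start += seg_len
    if best is None:
        raise ValueError("path needs at least one segment of non-zero length")
    return best[1], best[2]

# Hypothetical reference path: 10 m east, then 10 m north.
reference = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
s_left, d_left = cartesian_to_frenet(4.0, 2.0, reference)
s_right, d_right = cartesian_to_frenet(12.0, 5.0, reference)
```

In this frame a lane change is mostly a change in d while s keeps increasing, which is one reason planners often work in Frenet coordinates.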

JTE-CFlow for Low-Light Enhancement and Zero-Element Pixels Restoration With Application to Night Traffic Monitoring Images

C. Hu, Y. Hu, L. Xu, Y. Guo, Z. Cai, X. Jing, and P. Liu

The authors observe that low-light RGB images, as well as night traffic monitoring (NTM) images, contain many color pixels whose values are zero because of the low light, which means that low-light images suffer from both information weakness and the information loss of zero-element pixels. In this article, they propose a novel flow-based generative method, JTE-CFlow, for low-light image enhancement, which consists of a joint-attention transformer-based conditional encoder (JTE) and a map-wise cross-affine coupling flow (CFlow). Specifically, JTE executes short-range and long-range operations by RRDBs (i.e., residual-in-residual dense blocks) and JATs (i.e., joint-attention transformer blocks) connected in series. JAT achieves weak information amplification and information loss restoration of zero-element pixels by integrating self-attention and specific-attention, which share the same value vectors; the query and key vectors of specific-attention come from the zero-map feature of the low-light image. On the other hand, CFlow develops a map-wise cross-affine coupling (MCAC) layer to perform cross-learning for the flow feature, and a multiplication coupling network (MCN) to learn the transformation parameters of MCAC. JTE-CFlow learns to map the difference between the outputs of CFlow and JTE (i.e., the residual code) into a standard normal distribution, and the inverse network of CFlow takes the latent feature of the low-light image as its input to infer the enhanced image. Experiments show that JTE-CFlow outperforms most SOTA methods on seven mainstream low-light datasets with the same architecture, and can be applied to enhance NTM images. The source code and pre-trained models are available at https://github.com/NJUPT-IPR-HuYin/JTE-CFlow.
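The notion of zero-element pixels can be made concrete with a small sketch. It assumes that a zero map is simply a per-channel binary mask marking values equal to zero; this reading, the function names, and the toy image are assumptions and are not taken from the JTE-CFlow repository.

```python
def zero_map(image):
    """Per-channel mask with 1 where a channel value is exactly zero.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[tuple(1 if c == 0 else 0 for c in px) for px in row] for row in image]

def zero_fraction(image):
    """Share of channel values in the image that are exactly zero."""
    flags = [f for row in zero_map(image) for px in row for f in px]
    if not flags:
        raise ValueError("image must contain at least one pixel")
    return sum(flags) / len(flags)

# Hypothetical 2x2 low-light image with 8-bit channel values.
dark_image = [
    [(0, 3, 0), (5, 0, 2)],
    [(0, 0, 0), (1, 1, 1)],
]
mask = zero_map(dark_image)
fraction = zero_fraction(dark_image)
```

A zero value carries no recoverable intensity, so simple brightness scaling cannot restore it; that is the information loss the summary refers to.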

Energy Efficient Beamforming Optimization for Integrated On-Demand Sensing and Communication in High-Speed Railway Mobile Networks

T. Du, X. Fang, and L. Yan

An integrated on-demand sensing and communication (IDSAC) mechanism is introduced to improve high-speed railway safety and communication. The approach optimizes energy efficiency by dynamically balancing sensing and communication needs in real time. By integrating beamforming optimization with age of information (AoI) constraints, IDSAC ensures timely and accurate sensing updates while maintaining efficient communication. This solution addresses the unique challenges of high-speed railway mobile networks, improving both operational safety and energy performance.

Coordinated Battery Charging and Swapping Scheduling of EVs Based on Multilevel Deep Reinforcement Learning for Urban Governance

B. Zhang, Z. Chen, L. Zang, P. Guo, and R. Miao

A multilevel deep reinforcement learning method is proposed to coordinate the actions of EVs within the battery charging and swapping station (BCSS) environment. The initial decisions for EVs are provided by the multi-agent advantage actor-critic (MAA2C) model. Then, an advantage value-based algorithm is employed to address the constraints of limited charging and swapping equipment. Moreover, an action-driven dynamic simulation environment, which incorporates both charging and swapping modes, is developed to provide essential data support for the proposed model training and simulation. The MAA2C can provide real-time charging and battery-swapping strategies for individual EVs, while maintaining stability in large-scale scenarios.

Intelligent Reflecting Surface and Network Slicing Assisted Vehicle Digital Twin Update

L. Li, L. Tang, Y. Wang, T. Liu, and Q. Chen

This article proposes a method that uses an intelligent reflecting surface (IRS) and network slicing to ensure the isolation of vehicle digital twins (VDTs) and to achieve better VDT update times within limited resources. Considering the dynamic variability of vehicles and the environment, the article proposes an improved deep reinforcement learning algorithm based on the actor-critic framework to allocate communication and computing resources and to adjust the phase shift of the IRS. Extensive simulation results indicate that the proposed algorithm performs better than the benchmark algorithms.

Day-to-Day Integrated Optimization of Bus Transit Maintenance and Vehicle Scheduling

Y. He, W. Zhang, T. Liu, J. Ma, and H. Sun

The study addresses the integrated bus maintenance and vehicle scheduling problem in a day-to-day dynamic operation setting to ensure prompt treatment of both regular preventive and irregular predictive maintenance needs. The problem is formulated as a mixed-integer linear programming model aiming to minimize total operational costs and risks over a day-to-day rolling planning horizon. A two-stage decomposition (TSD) method is proposed to solve this challenging problem. Computational experiments conducted on a real-world bus line demonstrate its effectiveness and superiority. Compared to the one-step solution method and another competitor in a related study, the proposed TSD solution method consumes far less computation time while delivering high-quality solutions. Moreover, the TSD solution method can be accelerated significantly further by implementing physically parallel computation in the second stage. The computational results also highlight the benefits of the proposed two-stage optimization approach in enhancing operational cost-efficiency and bus resource utilization.

Neuro-Adaptive Formation Tracking for Networked Autonomous Surface Vehicles Under Time Delay via Hierarchical Information Security Control

X.-Y. Zhang, T. Han, B. Xiao, and H. Yan

This article investigates the formation tracking control problem for autonomous surface vehicles (ASVs) with dynamic uncertainties and external disturbances under secure and privacy-preserving interaction. An innovative hierarchical information security control (HISC) framework is proposed to solve the estimation problem in a secure and privacy-preserving way and the formation tracking problem for ASVs. The information processing layer of the HISC framework focuses on the distributed secure and privacy-preserving estimator (DSPE) algorithm under sampled-data interaction and the local control layer is mainly about the robust neuro-adaptive controller without any model information for the formation of networked ASVs under communication delay. Through systematic analysis, sufficient conditions are given for guaranteeing the stability and convergence of the studied closed-loop system. Ultimately, simulation outcomes are showcased to corroborate the efficacy of the proposed control scheme.

PoTC: A Proof of Traffic-Flow Condition Consensus for Secure and Efficient Blockchain in the Internet of Vehicles

Y. Zhao, N. Ding, Y. Hao, and L. Xu

The Internet of Vehicles faces communication and efficiency challenges with traditional blockchain consensus mechanisms. Committee-based Byzantine fault tolerance improves scalability but is vulnerable to manipulation by malicious committee members. This article proposes the proof of traffic-flow condition (PoTC) consensus to address these issues. PoTC uses traffic-data-driven grouping to form committees, enhancing scalability and communication efficiency while maintaining fault tolerance. A robust node assessment model provides reliability scores for participants. PoTC’s security is validated through analysis against two adversary models: invalid voting and traffic-flow data tampering. Extensive experiments demonstrate PoTC’s efficiency and resilience against malicious node interference, making it a promising solution for secure and efficient blockchain in Internet of Vehicles environments.
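As background on why committee composition matters, the sketch below applies the classical Byzantine fault tolerance bound, under which a committee of n members tolerates at most f faulty members when n >= 3f + 1. This is the textbook bound, not a statement about the thresholds PoTC itself uses; the function names are hypothetical.

```python
def max_faulty_members(committee_size):
    """Largest f such that committee_size >= 3f + 1 (classical BFT bound)."""
    if committee_size < 1:
        raise ValueError("committee must have at least one member")
    return (committee_size - 1) // 3

def quorum_size(committee_size):
    """Smallest vote count whose any two quorums share an honest member."""
    f = max_faulty_members(committee_size)
    # Equivalent to ceil((n + f + 1) / 2); equals 2f + 1 when n == 3f + 1.
    return (committee_size + f) // 2 + 1

tolerated = [max_faulty_members(n) for n in (4, 7, 10)]
quorums = [quorum_size(n) for n in (4, 7, 10)]
```

A small committee is cheaper to coordinate but easier to capture, since only a few malicious members are needed to exceed f; that is the manipulation risk the summary points to.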

GRNN Model With Feedback Mechanism Incorporating k-Nearest Neighbor and Modified Gray Wolf Optimization Algorithm in Intelligent Transportation

X. Wu, J. Zhan, W. Ding, and W. Pedrycz

Existing methods for intelligent transportation prediction tend to lose some important information during the prediction process. Moreover, the randomness of k-value selection in k-nearest neighbor (KNN) limits the prediction performance. To address these challenges, this article designs a dynamic prediction system based on a generalized regression neural network (GRNN) with a feedback mechanism, built on KNN and a modified gray wolf optimization (MGWO) algorithm and named KNN-MGWO-FMGRNN. First, the discrepancy value between different samples for each feature is calculated and the k nearest samples to a certain sample are taken as the discrepancy result of this sample under this feature. Then, the KNN results are fused with the difference results and combined with the MGWO algorithm to obtain the optimal k-value. Furthermore, the optimal FSS and model learning are completed simultaneously using FMGRNN.

Simulating Pedestrian Flow on Slopes via Transfer Learning Approach: From Single-File to Crowd

W. Xie, N. Jiang, Y. Ma, E. W. M. Lee, X. Li, and H. Yu

A transferable pedestrian motion simulation network (TPMSN) is proposed for accurately modeling pedestrian dynamics on slopes. The TPMSN incorporates five key input features to capture relative position information, neighbor motion states, and trends in pedestrian motion. Through a transferable layer, the network demonstrates versatility in handling crowd simulation tasks across different slope angles. Simulations carried out by the two networks demonstrate promising accuracy and authenticity. In addition, the networks could capture pedestrian lateral body sway, a crucial aspect of real-life pedestrian behavior on slopes, as evidenced by lane entropy trends consistent with empirical studies. Overall, the TPMSN offers a successful approach for crowd simulation on slopes.

Context-Aware Knowledge Graph Framework for Traffic Speed Forecasting Using Graph Neural Network

Y. Zhang, Y. Wang, S. Gao, and M. Raubal

This article introduces a novel context-aware knowledge graph (CKG) framework to enhance traffic speed forecasting by effectively modeling spatial and temporal contexts. Using a relation-dependent integration strategy, the framework generates context-aware representations that capture the intricate spatio-temporal dependencies of urban environments. Building on this foundation, a CKG-GNN model is developed, integrating the CKG, dual-view multi-head self-attention mechanisms, and graph neural networks. This integration not only significantly improves predictive accuracy but also underscores the importance of contextual information in forecasting traffic dynamics. By bridging domain knowledge with graph neural architectures, the proposed approach demonstrates its potential for advancing intelligent transportation systems.

Human-Like Interactive Lane-Change Modeling Based on Reward-Guided Diffusive Predictor and Planner

K. Chen, Y. Luo, M. Zhu, and H. Yang

Lane changing presents a dynamic scenario characterized by intricate interactions among vehicles. Within a mixed-autonomy traffic environment, modeling a human-like lane-change trajectory enables human drivers to better understand and predict autonomous vehicles’ behaviors, thereby enhancing road safety and travel efficiency. In this study, the authors achieve human-like interactive lane-change modeling based on a novel framework named Diff-LC. The human-like modeling of LCV behaviors relies on an advanced diffusive planner, and the implemented trajectory is selected based on the recovered LCV reward function learned through multi-agent adversarial inverse reinforcement learning (MA-AIRL). To account for interactions between FVs and LCVs, they further employ a diffusive predictor to forecast future behaviors of FVs conditioned on both historical and planned trajectories. In addition, they leverage the recovered reward function of FVs to enable controllable prediction of trajectories. In the experiments, they begin by analyzing the significance of features in the recovered reward functions and then compare the LCV with the FV. To validate the effectiveness of the proposed framework, they compare the diffusive predictor and planner with several state-of-the-art methods. The results demonstrate that motions planned by Diff-LC closely reach the intended positions with small displacement errors and exhibit speed and jerk distributions highly similar to those of human drivers. They also conduct a dynamic simulation to evaluate Diff-LC’s performance across different traffic conditions. Finally, they explore customized generation using the diffusion posterior sampling method. The code can be found at https://github.com/zeonchen/Diff-LC/.

D-TLDetector: Advancing Traffic Light Detection With a Lightweight Deep Learning Model

Y. Huang and F. Wang

A lightweight model for traffic light detection is proposed to address the challenges of balancing detection speed and accuracy in intelligent driving systems. The model integrates structured reparameterization and lightweight vision transformers in the backbone to enhance informational richness and positional awareness. A low-GD neck architecture improves multi-scale feature integration and reduces information loss. Data augmentation using stable diffusion generates diverse weather scenarios, enhancing model robustness. The model achieves 135 FPS and 98.23% accuracy with only 1.3M parameters on the YCTL2024 dataset and demonstrates strong generalization on the Bosch Small Traffic Lights Dataset.

Efficient Federated Connected Electric Vehicle Scheduling System: A Noncooperative Online Incentive Approach

S. Zhang and S. Zhang

As one of the most promising elements in intelligent transportation systems (ITSs), connected electric vehicles (CEVs) can be collectively utilized to improve the quality of essential transportation services. However, involving CEVs in providing vehicle-to-grid (V2G) services becomes a crucial problem since they are selfish and belong to different parties. To solve this problem, the authors propose an efficient federated CEV scheduling framework that implements a noncooperative online incentive approach. Case studies assess the feasibility and effectiveness of the proposed noncooperative incentive approach, in which the efficient motivation of the CEVs contributes to high-quality V2G services. In addition, the use of sufficient online parking allocation methods can further increase the quality of V2G services.

Multi-Hop RIS-Aided Learning Model Sharing for Urban Air Mobility

K. Xiong, H. Yu, S. Leng, C. Huang, and C. Yuen

Urban air mobility (UAM), powered by flying cars, is poised to revolutionize urban transportation by expanding vehicle travel from the ground to the air. This advancement promises to alleviate congestion and enable faster commutes. However, the fast travel speeds mean vehicles will encounter vastly different environments during a single journey. As a result, onboard learning systems need access to extensive environmental data, leading to high costs in data collection and training. These demands conflict with the limited in-vehicle computing and battery resources. Fortunately, learning model sharing offers a solution. Well-trained local deep learning (DL) models can be shared with other vehicles, reducing the need for redundant data collection and training. However, this sharing process relies heavily on efficient vehicular communications in UAM. To address these challenges, this article leverages the multi-hop reconfigurable intelligent surface (RIS) technology to improve DL model sharing between distant flying cars. The authors also employ knowledge distillation to reduce the size of the shared DL models and enable efficient integration of non-identical models at the receiver. The approach enhances model sharing and onboard learning performance for cars entering new environments. The simulation results show that the scheme improves the total reward by 85% compared to benchmark methods.

Decoupling Objectives for Segmented Path Planning: A Subtask-Oriented Trajectory Planning Approach

G. Liao, C. Fu, Y. Yu, K. Lai, B. Xia, and J. Xia

In this article, the path planning task is decoupled into two separate subtasks. An artificial potential field based on vehicle vertices is constructed, and an effective transit point selection approach is utilized. The optimization problem is transformed into a multi-attribute decision-making problem to simplify the solving process. In addition, a velocity planner is devised to generate cubic-polynomial velocity profile candidates, and an SVM-based stability classifier is constructed to determine the optimal profile. The proposed method has been verified through simulation and real vehicle experiments, confirming its capacity to generate a safe, comfortable, efficient, and trackable trajectory.
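A cubic-polynomial velocity profile can be sketched as follows. The example assumes that each candidate is fixed by the speed and acceleration at the start and end of a planning horizon; these boundary conditions, the function names, and the numbers are illustrative assumptions, and the SVM-based selection step is not shown.

```python
def cubic_velocity_profile(v0, a0, v_end, a_end, duration):
    """Coefficients (c0, c1, c2, c3) of v(t) = c0 + c1*t + c2*t**2 + c3*t**3.

    The profile matches speed and acceleration at t = 0 and t = duration.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    delta = v_end - v0 - a0 * duration  # speed change not explained by a0
    gamma = a_end - a0                  # required change in acceleration
    c2 = (3.0 * delta - gamma * duration) / duration ** 2
    c3 = (gamma * duration - 2.0 * delta) / duration ** 3
    return (v0, a0, c2, c3)

def velocity_at(coeffs, t):
    """Evaluate the cubic velocity profile at time t."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * t + c2 * t ** 2 + c3 * t ** 3

# Hypothetical candidates: start at 10 m/s and settle at several end speeds in 5 s.
candidates = {
    v_end: cubic_velocity_profile(10.0, 0.0, v_end, 0.0, 5.0)
    for v_end in (8.0, 10.0, 12.0)
}
speed_up = cubic_velocity_profile(10.0, 0.0, 20.0, 0.0, 5.0)
```

Each candidate is cheap to generate and smooth in acceleration, so a downstream classifier or cost function only has to rank a small, well-behaved set.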

CDRP3: Cascade Deep Reinforcement Learning for Urban Driving Safety With Joint Perception, Prediction, and Planning

Y. Yang, F. Ge, J. Fan, J. Zhao, and Z. Dong

A new cascade deep reinforcement learning framework, CDRP3, is introduced to enhance the safety decision-making of self-driving vehicles in complex scenarios and emergencies. The framework incorporates a multi-modal spatio-temporal perception module that integrates sensor data from cameras and LiDAR to capture dynamic environmental information, while a future state prediction module models and forecasts the interactions with other traffic participants. In addition, the PPO-based planning module utilizes the comprehensive environmental information obtained from perception and prediction to optimize driving strategies through a lateral and longitudinal separated multi-branch network structure guided by a customized reward function. By transferring knowledge from perception and prediction to planning, CDRP3 effectively improves driving safety in urban environments.

MIM: High-Definition Maps Incorporated Multi-View 3D Object Detection

J. Xiao, S. Wang, J. Zhou, Z. Tian, H. Zhang, and Y.-F. Wang

This article addresses the underexplored challenge of fusing multi-view images and high-definition (HD) maps for autonomous driving. It highlights the advantages of utilizing HD maps in object detection and proposes a novel method that integrates HD maps into multi-view 3-D detection, effectively resolving major modality discrepancies in view, semantics, and scale. Experiments validate the effectiveness of the proposed camera-map fusion approach and provide a detailed analysis of the role of HD maps in 3-D detection. The code is publicly available to support future research.

A Multi-Vehicle Self-Organized Cooperative Control Strategy for Platoon Formation in Connected Environment

M. Zhang, C. Wang, W. Zhao, J. Liu, and Z. Zhang

A multi-vehicle self-organized cooperative control strategy for platoon formation is proposed, which includes vehicle self-organizing formation control and platoon cooperative merging control. The vehicle self-organizing formation control module organizes the merging vehicles within the V2V communication range into multiple local platoons according to the dynamic self-adjusting critical interval. The merging vehicles join the target platoon as a whole in the form of a local platoon, which transforms the complex multi-vehicle merge problem into a platoon cooperative control problem and improves the merging efficiency. The platoon cooperative merging control module adopts distributed model predictive control to make the target platoon split and create a merging gap, and to make the local platoon merge into the target platoon safely and smoothly. Simulation experiments show that the proposed control strategy enables multiple vehicles to merge into a platoon efficiently, safely, and stably.

Switching Dynamic Event-Triggered Sliding Mode Based Trajectory Tracking Control for ASVs With Nonlinear Dead-Zone and Saturation Inputs

G. Zhang, C.-M. Chew, Y. Xu, and M. Fu

This article investigates discrete-time sliding mode trajectory tracking control for fully actuated autonomous surface vessels (ASVs) with unknown nonlinear dead-zone and saturation inputs. By establishing a direct mapping between ASV positions and control inputs, the trajectory tracking design is simplified. Unlike linear dead-zone and saturation input constraints with known parameters, the authors consider a more realistic scenario of unknown nonlinearity, employing adaptive neural networks to approximate and compensate for the resulting unknown dynamics. A novel switching dynamic event-triggered mechanism is proposed to reduce unnecessary data transmission, which switches triggering conditions based on auxiliary dynamic variable variations. Meanwhile, the controller output variation is integrated into the event-triggered conditions to enhance tracking control performance. Based on this, a discrete-time sliding mode trajectory tracking controller suitable for large sampling periods is designed. This ensures satisfactory control effectiveness while further reducing data transmission frequency and conserving communication resources within a larger range of sampling periods.

Multiobjective Vehicle Routing Optimization With Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II

R. Wu, R. Wang, J. Hao, Q. Wu, P. Wang, and D. Niyato

This article proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW), aiming to use a single deep reinforcement learning (DRL) model to solve the entire multiobjective optimization problem. First, the authors design an MOVRPTW model to balance the minimization of travel costs and the maximization of customer satisfaction. Subsequently, they present a novel DRL framework that incorporates a transformer-based policy network. This network is composed of an encoder module, a weight embedding module where the weights of the objective functions are incorporated, and a decoder module. NSGA-II is then utilized to optimize the solutions generated by WADRL. Finally, extensive experimental results demonstrate that the method outperforms existing and traditional methods.
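The trade-off between travel cost and customer satisfaction can be illustrated with Pareto dominance, the relation on which NSGA-II ranks solutions. The sketch below is a plain non-dominated filter, not NSGA-II or WADRL itself, and the candidate solutions are hypothetical. Satisfaction is negated so that both objectives are minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical routing solutions as (travel cost, -customer satisfaction).
solutions = [
    (100.0, -0.90),
    (120.0, -0.95),
    (110.0, -0.85),
    (100.0, -0.80),
    (150.0, -0.95),
]
front = pareto_front(solutions)
```

The front contains the solutions among which a decision-maker must still choose by weighting the objectives, which is where a weight-aware policy becomes useful.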

Integrating GPU-Accelerated for Fast Large-Scale Vessel Trajectories Visualization in Maritime IoT Systems

M. Liang, K. Liu, R. Gao, and Y. Li

With the advancement of satellite communication technology, the maritime Internet of Things (IoT) has made significant progress. As a result, vast amounts of automatic identification system (AIS) data from global vessels are transmitted to various maritime stakeholders through maritime IoT systems. AIS data contains a large amount of dynamic and static information that requires effective and intuitive visualization for comprehensive analysis. However, two major deficiencies challenge current visualization models: a lack of consideration for interactions between distant pixels and low efficiency. To address these issues, the authors developed a large-scale vessel trajectories visualization algorithm, called the non-local kernel density estimation (NLKDE) algorithm, which incorporates a non-local convolution process. It accurately calculates the density distribution of vessel trajectories by considering correlations between distant pixels. In addition, they implemented the NLKDE algorithm under a graphics processing unit (GPU) framework to enable parallel computing and improve operational efficiency. Comprehensive experiments using multiple vessel trajectory datasets show that the NLKDE algorithm excels in vessel trajectory density visualization tasks, and the GPU-accelerated framework significantly shortens the execution time to achieve real-time results. From both theoretical and practical perspectives, GPU-accelerated NLKDE provides technical support for real-time monitoring of vessel dynamics in complex water areas and contributes to constructing maritime intelligent transportation systems. The code for this article can be accessed at: https://github.com/maohliang/GPU-NLKDE.
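
As background for the density visualization described above, the sketch below shows plain Gaussian kernel density estimation on a pixel grid. It is not the authors' NLKDE algorithm and contains no non-local convolution step; the function name, bandwidth, and sample points are assumptions for illustration.

```python
import math


def density_grid(points, width, height, bandwidth=1.0):
    """Accumulate a Gaussian kernel around every trajectory point.

    points: list of (x, y) positions in pixel coordinates.
    Returns a height x width list of lists holding the density per pixel.
    """
    grid = [[0.0] * width for _ in range(height)]
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * max(len(points), 1))
    for row in range(height):
        for col in range(width):
            total = 0.0
            for x, y in points:
                d2 = (col - x) ** 2 + (row - y) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            grid[row][col] = norm * total
    return grid


# Hypothetical AIS positions already projected onto a 5 x 5 pixel grid.
ais_points = [(1, 1), (1, 2), (3, 3)]
grid = density_grid(ais_points, width=5, height=5, bandwidth=1.0)
```

Each pixel value is computed independently of the others, which is the kind of workload that maps naturally onto one GPU thread per pixel.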

Graph Transformer-Based Dynamic Edge Interaction Encoding for Traffic Prediction

N. Ouyang, L. Ao, Q. Cai, W. Wan, X. Ren, X. He, and K. Sheng

Traffic prediction is an essential function of intelligent transportation systems for traffic control and autonomous driving. Most existing methods encode traffic spatial and temporal data separately, and then design a feature fusion module to correlate spatial and temporal features. However, spatial information is often static, and repetitive static spatial encoding leads to a waste of resources, especially in large-scale traffic network prediction. In this article, the authors propose a dynamic edge interaction encoding method for spatio-temporal features based on an inverse transformer (iTransformer) and graph transformer, named iTPGT-former. The dynamic edge interaction process is designed to embed dynamic temporal features into static edges via a convolutional embedding module. To enhance the graph transformer, a relative position encoding strategy based on the self-attention score of the positive definite kernel (PDK) on graphs and a method for graph substructure encoding (GSE) via enumeration of paths are introduced. In the experiments and discussion, iTPGT-former is evaluated for accuracy, parameter count, and inference speed, and extensive ablation experiments are provided on six publicly available traffic datasets. The results show that iTPGT-former outperforms the baseline model in both traffic flow and traffic speed prediction. The maximum improvement is achieved in the METR-LA 60-min speed prediction task, with a 15.2% reduction in mean absolute percentage error (MAPE). In addition, the inference of iTPGT-former is significantly faster than that of the GCN-based method. The implementation of iTPGT-former is available at https://github.com/ouyangnann/iTPGTN-former.

Event-Triggered Train Formation Control of Multiple Autonomous Surface Vehicles in Polar Communication Interference Environment

R. Liu, W. Zhang, G. Zhang, W. Bai, and D. Chen

This article investigates the event-triggered train formation control problem for formation systems of multiple autonomous surface vehicles (ASVs) in a polar communication interference environment. First, a distributed resilient guidance algorithm is introduced to generate the reference route based on waypoints. In the guidance algorithm, the distributed resilient leader predictor (RLP) is applied to obtain the states of ice-breaking ships when communication fails, and the resilient train formation scheme is designed to compute the reference signals for ASVs. Subsequently, an adaptive neural event-triggered train formation control algorithm is developed. In the control algorithm, neural networks (NNs) are used to approximate model uncertainties, and event-triggered control (ETC) is employed to minimize controller updates. Furthermore, the threshold of the event-triggered mechanism (ETM) can be dynamically adjusted according to the system states. It is proved that the multiple-ASV system is stable in a polar communication interference environment.
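
The idea of an event-triggered mechanism with a state-dependent threshold can be illustrated with a generic scalar example. This is a sketch under stated assumptions, not the controller from the article: the triggering rule, the `base_threshold` and `gain` parameters, and the sample values are all hypothetical.

```python
def run_event_triggered(states, base_threshold=0.1, gain=0.05):
    """Hold the last transmitted state until the deviation from it exceeds a
    threshold that scales with the current state magnitude.

    Returns the list of time indices at which an update was triggered.
    """
    last_sent = states[0]
    trigger_times = [0]  # the initial state is always transmitted
    for k in range(1, len(states)):
        error = abs(states[k] - last_sent)
        threshold = base_threshold + gain * abs(states[k])
        if error > threshold:
            last_sent = states[k]
            trigger_times.append(k)
    return trigger_times


# Hypothetical scalar state samples (for example, a heading error over time).
samples = [0.0, 0.05, 0.08, 0.30, 0.32, 0.35, 0.80, 0.81]
times = run_event_triggered(samples)
```

With these samples only three of the eight steps trigger an update, which is the sense in which event triggering reduces controller updates.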

CSNet: Cross-Stage Subtraction Network for Real-Time Semantic Segmentation in Autonomous Driving

M. A. M. Elhassan, C. Zhou, D. Zhu, A. B. M. Adam, A. Benabid, A. Khan, A. Mehmood, J. Zhang, H. Jin, and S.-W. Jeon

A novel cross-stage subtraction network (CSNet) is introduced to address the challenges of real-time semantic segmentation for autonomous driving. CSNet employs a cross-stage subtraction module (CSM) to aggregate multi-scale features across short, medium, and long fusion paths, improving semantic feature refinement and object boundary recognition. In addition, a semantic guided context reasoning (SGCR) module is proposed to model contextual relations and enhance scene understanding. The experimental results demonstrate that CSNet achieves state-of-the-art accuracy and efficiency on benchmarks such as Cityscapes and CamVid, making it highly suitable for diverse urban scenarios.

Prescribed Performance-Based Optimal Formation Control for USVs With Position Constraints and Yaw Angle Time-Varying Partial Constraints

L. Cao, Y. Qin, Y. Pan, and H. Liang

This article considers the prescribed performance-based optimal formation control problem for unmanned surface vehicles (USVs). Specifically, prescribed-time performance constraints are imposed on the position tracking errors between each vehicle and its leader. Then, a prescribed performance-based optimal formation control strategy is developed to guarantee that each vehicle achieves collision-free formation control while maintaining connectivity. Inspired by prescribed performance control, an improved asymmetric barrier function with prescribed performance is provided to ensure that the yaw angle errors satisfy the prescribed performance constraints. Finally, theoretical analysis demonstrates that under the optimal formation control scheme the position tracking errors converge to a prescribed arbitrarily small region within a prescribed time interval and the yaw angle adheres to the time-varying partial constraints, subject to optimal cost with limited communication ranges and collision avoidance constraints. The simulation results and comprehensive comparisons demonstrate the effectiveness and superiority of the scheme.

TAURITE: Stackelberg Equilibrium in Blockchained Energynet Through Electric Vehicles

G. Kumar, R. Saha, M. Conti, and J. J. P. C. Rodrigues

The integration of electric vehicles (EVs) into the energynet, the network from power generation to EV charging station, presents a symbiotic relationship with potential benefits for sustainable and efficient transportation. However, the existing research has revealed challenges in maintaining an equilibrium between energy supply and demand, often resulting in underutilization or overutilization of energy networks. Blockchain technology has emerged as a promising solution to enhance transparency and secure decentralized energy distribution; however, it fails to reach equilibrium under demand-supply uncertainty and/or to handle information cascades. In this article, the authors introduce sTAckelberg eqUilibRium in blockchaIned energyneT with EVs (TAURITE), a novel blockchain-based energynet framework that explicitly leverages the Stackelberg model for energy flow equilibrium within EV interfaces. TAURITE employs subgame perfect Nash equilibrium (SPNE) to address demand uncertainty in dynamic vehicular environments. It also tackles information cascades’ impact on energy distribution, demonstrating its ability to maintain equilibrium even in such scenarios. TAURITE introduces a multi-variate polynomial-based key generation process through the smart contract AVTAL and incorporates proof-of-energy-equilibrium (PoEE) as an energy sector consensus mechanism. The experimental results show that TAURITE significantly improves throughput, latency, and energy efficiency, with an average 30% enhancement in these metrics. Notably, TAURITE ensures 100% allocation stability, even in the presence of information cascades, marking a substantial advancement in sustainable and efficient energy management within the evolving energynet-EV ecosystem.

Scheduling for Maximizing the Information Freshness in Vehicular Edge Computing-Assisted IoT Systems

X. Xie, T. Zhong, and H. Wang

Vehicular edge computing (VEC), as an emerging computing paradigm, enables the timely processing of computing tasks at the network edge through on-vehicle servers, thereby meeting users’ demands for information freshness. In this article, the authors introduce the age of information (AoI) to measure information freshness and investigate the scheduling problem of minimizing the long-term average AoI in VEC-assisted Internet of Things systems. The main challenge lies in the strong coupling between link scheduling and server selection under the location constraints of VEC. To address this issue, they design a scheduling strategy based on deep reinforcement learning and improve the neural network structure using a branch network approach, reducing complexity by decreasing the number of actions represented in the network’s output layer. Moreover, they introduce an action masking scheme that accelerates the algorithm’s convergence in this system. The numerical results show that the proposed scheduling algorithm can achieve up to a 25.4% performance gain compared to existing advanced algorithms.
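
For readers unfamiliar with the metric, the age of information can be pictured as a counter that grows by one per time slot and drops when a fresh update is delivered. The sketch below computes the time-average AoI for a fixed delivery schedule in a common discrete-time form; it is an assumption-laden illustration (initial age of 1, hypothetical schedule and function name), not the system model or scheduler from the article.

```python
def average_aoi(delivery_slots, delays, horizon):
    """Average age of information over `horizon` discrete time slots.

    delivery_slots: slots in which a fresh update is received.
    delays: for each delivery, how many slots old the update already is.
    The age grows by one per slot and drops to the update's delay on delivery.
    """
    delay_at = dict(zip(delivery_slots, delays))
    age = 1  # assumed initial age
    total = 0
    for t in range(horizon):
        if t in delay_at:
            age = delay_at[t]
        total += age
        age += 1
    return total / horizon


# Hypothetical schedule: updates arrive in slots 2 and 5, each 1 slot old.
avg = average_aoi(delivery_slots=[2, 5], delays=[1, 1], horizon=8)
```

A scheduler that delivers updates more often, or with less delay, lowers this average, which is the quantity the scheduling problem above seeks to minimize.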

A Three-Stage Decision-Making Method Based on Machine Learning for Preventive Maintenance of Airport Pavement

Y. Li, Z. Niu, Y. He, Q. Hu, and J. Zhang

This article proposes a three-stage machine learning-based method for preventive maintenance (PM) decision-making on airport pavements. The first stage involves a PCA-PSO-SVM model for classifying pavements into three maintenance levels: daily, PM, and major. The second stage refines the PM requirements using OPTICS and DBSCAN clustering algorithms. The final stage selects appropriate maintenance measures based on the dominant pavement damage types. The results show that the PCA-PSO-SVM model outperforms the original SVM model, with a significant increase in classification accuracy. Clustering analysis indicates that the OPTICS algorithm performs better than DBSCAN, and four PM need categories were identified through dimensionality reduction and clustering visualization.

Exploring Decision Shifts in Autonomous Driving With Attribution-Guided Visualization

R. Shi, T. Li, Y. Yamaguchi, and L. Zhang

This article introduces an attribution-guided visualization method to explain decision-making in autonomous driving systems. Understanding why models make specific driving decisions remains challenging, as small perceptual differences can lead to decisions that deviate from human reasoning. The authors propose a cumulative layer fusion attribution method to identify key decision-making parameters. These attributions inform visualization optimization by applying weighted parameters to crucial information, ensuring that decision changes stem only from critical modifications. In addition, they implement an indirect regularization method to enhance visualization quality without extra hyperparameters. The experimental results on large datasets show that the approach generates insightful visual explanations and surpasses state-of-the-art methods in both qualitative and quantitative assessments.

Equipping With Cognition: Interactive Motion Planning Using Metacognitive-Attribution Inspired Reinforcement Learning for Autonomous Vehicles

X. Hou, M. Gan, W. Wu, Y. Ji, S. Zhao, and J. Chen

A metacognitive-attribution inspired reinforcement learning (MAIRL) approach is proposed to address interactive motion planning for autonomous vehicles. By integrating the metacognitive theory and attribution theory from the psychology field with reinforcement learning, this study enriches the agent’s learning mechanisms with human cognitive processes to foster a unified cognitive structure and control strategy. Specifically, it applies metacognitive theory’s three core elements—metacognitive knowledge, metacognitive monitoring, and metacognitive reflection—to enhance the control framework’s capabilities in skill differentiation, real-time assessment, and adaptive learning. Furthermore, inspired by attribution theory, it decomposes the reward system in RL algorithms into three components: 1) skill improvement; 2) existing ability; and 3) environmental stochasticity. This interdisciplinary approach not only enhances the understanding and applicability of RL algorithms but also represents a meaningful step toward modeling advanced human cognitive processes in the field of autonomous driving.

Separation and Rendezvous Control With Batteries Replacement for the UAV-USV Ecosystem: A Finite-Time Bipartite Method Under the MPC Structure

S. Li, Y. Zhu, Z. Li, Y. Li, and G. Guo

Finite-time and bipartite control are rarely considered in the recent literature on model predictive control (MPC). Although existing MPC methods can be applied to the separation and rendezvous problem, they may suffer from wasted time and consistency conflicts. Finite-time control facilitates motion planning by pre-calculating the coverage time horizon, which avoids unnecessary time-wasting. Bipartite control can realize collision avoidance for the multi-vehicle system unless the reference state is zero. Therefore, a finite-time bipartite (FTB) control method for the UAV-USV system based on MPC is proposed. Moreover, most existing UAV-USV systems do not consider a sustainable operating mechanism. In this article, to realize such a mechanism, a UAV-USV ecosystem is introduced and further improved. The recursive feasibility and asymptotic stability of the proposed method are proven.

A Real-Time Degeneracy Sensing and Compensation Method for Enhanced LiDAR SLAM

Z. Liao, X. Zhang, T. Zhang, Z. Li, Z. Zheng, Z. Wen, and Y. Li

LiDAR is widely used in simultaneous localization and mapping (SLAM) and autonomous driving, and LiDAR odometry is of great importance in multi-sensor fusion. However, in some unstructured environments, point cloud registration cannot constrain the poses of the LiDAR because of sparse geometric features, which degrades multi-sensor fusion accuracy. To address this problem, the authors propose a novel real-time approach to sense and compensate for the degeneracy of LiDAR. First, a degeneracy factor with a clear meaning is introduced to measure the degeneracy of LiDAR. Then, the density-based spatial clustering of applications with noise (DBSCAN) method is used to perceive the degeneracy adaptively, with better environmental generalization. Finally, the degeneracy perception results are utilized to fuse LiDAR and IMU, thus effectively resisting degeneracy effects. Experiments on the dataset show the method’s high accuracy and robustness and validate the algorithm’s adaptability to different environments and LiDAR scanning modalities.

An Explainable Q-Learning Method for Longitudinal Control of Autonomous Vehicles

M. Li, Z. Cui, Y. Wang, Y. Huang, and H. Chen

A novel explainable Q-learning method is proposed for the longitudinal control of autonomous vehicles (AVs) to address the black-box issue of AI in AVs. The method combines a deep Q-network with Shapley additive explanations (SHAP) to provide transparent decision-making processes. A positive SHAP approach is introduced to ensure the overall contribution of state features remains positive, enhancing the reliability of explanations. The effectiveness of the algorithm is demonstrated through numerical simulations, offering a step towards gaining user trust in AI-driven vehicles.
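
SHAP attributions are built on Shapley values, which average each feature's marginal contribution over all orderings of the features. The sketch below computes exact Shapley values for a toy value function; it is not the authors' positive SHAP method and does not use the SHAP library. The function `toy_q_value` and the feature names are hypothetical stand-ins for a trained deep Q-network and its state features.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values by averaging each feature's marginal contribution
    over all orderings. Only practical for a handful of features."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(present)
            present.add(f)
            contrib[f] += value_fn(present) - before
    return {f: c / len(orderings) for f, c in contrib.items()}


def toy_q_value(present):
    """Hypothetical stand-in for a Q-value evaluated with only the features
    in `present` revealed; not a trained deep Q-network."""
    q = 0.0
    if "gap" in present:
        q += 2.0
    if "relative_speed" in present:
        q += 1.0
    if "gap" in present and "relative_speed" in present:
        q += 1.0  # interaction term, shared equally by both features
    return q


phi = shapley_values(toy_q_value, ["gap", "relative_speed", "ego_speed"])
```

The attributions sum to the difference between the value with all features present and the value with none, which is the additivity property that makes such explanations easy to read.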
