Augmented Reality is a technique that allows users to overlay digital information on their physical world. Augmented Reality (AR) displays have exceptional characteristics from the Human–Computer Interaction (HCI) perspective. Given AR's growing popularity and application in diverse domains, improving its user-friendliness and adoption is critical. Context awareness is one such approach, since a context-aware AR application can adapt to the user, the environment, and the user's needs, enhancing ergonomics and functionality. This paper proposes the Intelligent Context-Aware Augmented Reality Model (ICAARM) for Human–Computer Interaction systems. This study explores and reduces interaction uncertainty by semantically modeling user-specific interaction together with its context, allowing personalised interaction. Sensory information is captured from an AR device to understand user interactions and context. These representations convey semantics to Augmented Reality applications about the user's intention to interact with a specific device affordance. Thus, this study describes personalised gesture interaction in VR/AR applications for immersive, intelligent environments.
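To make the context-conditioned interaction idea concrete, the following is a minimal sketch, assuming a PyTorch setup, of how a gesture embedding and a context embedding could be fused to score candidate device affordances; the layer sizes, the concatenation-based fusion, and all names are illustrative assumptions, not ICAARM's actual design.

```python
import torch
import torch.nn as nn

class ContextAwareGestureIntent(nn.Module):
    """Minimal sketch of context-conditioned gesture interpretation.

    Fuses a gesture embedding with a context embedding (user, environment,
    device state) to score candidate device affordances. Hypothetical
    illustration only; ICAARM's internals are not specified here.
    """
    def __init__(self, gesture_dim: int, context_dim: int, n_affordances: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(gesture_dim + context_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_affordances),
        )

    def forward(self, gesture: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # gesture: (B, gesture_dim) features from AR-device sensors;
        # context: (B, context_dim) encoded user/environment state.
        logits = self.fuse(torch.cat([gesture, context], dim=-1))
        return logits.softmax(dim=-1)  # probability over affordances
```

The point of the sketch is only that the same gesture can map to different affordances once the context vector changes, which is the personalisation the abstract describes.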
The integration of Unmanned Aerial Vehicles (UAVs) into Intelligent Transportation Systems (ITS) holds transformative potential for real-time traffic monitoring, a critical component of emerging smart city infrastructure. UAVs offer unique advantages over stationary traffic cameras, including greater flexibility in monitoring large and dynamic urban areas. However, detecting small, densely packed vehicles in UAV imagery remains a significant challenge due to occlusion, variations in lighting, and the complexity of urban landscapes. Conventional models often struggle with these issues, leading to inaccurate detections and reduced performance in practical applications. To address these challenges, this paper introduces CFEMNet, an advanced deep learning model specifically designed for high-precision vehicle detection in complex urban environments. CFEMNet is built on the High-Resolution Network (HRNet) architecture and integrates a Context-aware Feature Extraction Module (CFEM), which combines multi-scale feature learning with a novel Self-Attention and Convolution layer setup within a Multi-scale Feature Block (MFB). This combination allows CFEMNet to accurately capture fine-grained details across varying scales, which is crucial for detecting small or partially occluded vehicles. Furthermore, the model incorporates an Equivalent Feed-Forward Network (EFFN) Block to ensure robust extraction of both spatial and semantic features, enhancing its ability to distinguish vehicles from similar objects. To optimize computational efficiency, CFEMNet employs a local-window adaptation of Multi-head Self-Attention (MSA), which reduces memory overhead without sacrificing detection accuracy. Extensive experimental evaluations on the UAVDT and VisDrone-DET2018 datasets confirm CFEMNet's superior vehicle-detection performance compared to existing models. This new architecture establishes CFEMNet as a benchmark for UAV-enabled traffic management, offering enhanced precision, reduced computational demands, and scalability for deployment in smart city applications.
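As an illustration of the local-window MSA idea named in the abstract, here is a minimal PyTorch sketch that restricts multi-head self-attention to non-overlapping spatial windows, which is what bounds the memory cost; the window size, head count, and tensor layout are assumptions, not CFEMNet's published configuration.

```python
import torch
import torch.nn as nn

class LocalWindowMSA(nn.Module):
    """Multi-head self-attention restricted to non-overlapping local windows.

    Minimal sketch of window-based MSA; window=7 and num_heads=4 are
    illustrative choices, not values from the paper.
    """
    def __init__(self, dim: int, num_heads: int = 4, window: int = 7):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); H and W are assumed divisible by the window size.
        B, C, H, W = x.shape
        w = self.window
        # Partition the map into (B * num_windows, w*w, C) token sequences.
        x = x.view(B, C, H // w, w, W // w, w)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        out, _ = self.attn(x, x, x)  # attention only within each window
        # Reverse the partition back to (B, C, H, W).
        out = out.view(B, H // w, W // w, w, w, C)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return out
```

Because each window attends only to its own w*w tokens, the attention cost grows linearly with the number of windows instead of quadratically with the full feature map, which is the efficiency argument the abstract makes.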
Yahia Said, Yahya Alassaf, Taoufik Saidani, Refka Ghodhbani, Olfa Ben Rhaiem, Ali Ahmad Alalawi
The hands and face are the most important parts for expressing sign language morphemes in sign language videos. However, we find that existing Continuous Sign Language Recognition (CSLR) methods either lack the mining of hand and face information in their visual backbones or rely on expensive and time-consuming external extractors to obtain this information. In addition, signs have different lengths, whereas previous CSLR methods typically use a fixed-length window to segment the video to capture sequential features and then perform global temporal modeling, which disturbs the perception of complete signs. In this study, we propose a Multi-Scale Context-Aware network (MSCA-Net) to solve the aforementioned problems. Our MSCA-Net contains two main modules: (1) Multi-Scale Motion Attention (MSMA), which uses the differences among frames to perceive hand and face information at multiple spatial scales, replacing the heavy feature extractors; and (2) Multi-Scale Temporal Modeling (MSTM), which explores crucial temporal information in the sign language video at different temporal scales. We conduct extensive experiments on three widely used sign language datasets, i.e., RWTH-PHOENIX-Weather-2014, RWTH-PHOENIX-Weather-2014T, and CSL-Daily. The proposed MSCA-Net achieves state-of-the-art performance, demonstrating the effectiveness of our approach.
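To illustrate how frame differences can serve as a cheap motion cue in the spirit of MSMA, the following PyTorch sketch re-weights per-frame features with an attention map derived from temporal differences; the single spatial scale and the 3x3 projection are simplifications of my own, not the paper's exact module.

```python
import torch
import torch.nn as nn

class MotionAttention(nn.Module):
    """Frame-difference attention, a minimal sketch of the MSMA idea.

    Uses differences between neighbouring frames as a motion cue to
    re-weight spatial features, so fast-moving regions such as the hands
    and face receive higher weight without an external extractor.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, C, H, W) frame-level features.
        B, T, C, H, W = x.shape
        # Temporal difference as a motion cue; pad the last step with zeros.
        diff = torch.diff(x, dim=1)
        diff = torch.cat([diff, torch.zeros_like(x[:, :1])], dim=1)
        # Per-frame spatial attention map derived from the motion cue.
        attn = torch.sigmoid(self.proj(diff.reshape(B * T, C, H, W)))
        out = x.reshape(B * T, C, H, W) * attn
        return out.view(B, T, C, H, W)
```

A multi-scale version in the paper's sense would apply this at several spatial resolutions and merge the results; the sketch keeps one scale for brevity.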
Time series data plays a crucial role in intelligent transportation systems. Traffic flow forecasting is the precise estimation of future traffic flow within a specific region and time interval. Existing approaches, including periodic sequence models, regression models, and deep learning models, have shown promising results in short-term series forecasting. However, forecasting scenarios focused specifically on holiday traffic flow present unique challenges, such as distinct traffic patterns during vacations and the increased demand for long-term forecasts. Consequently, the effectiveness of existing methods diminishes in such scenarios. Therefore, we propose a novel long-term forecasting model based on scene matching and embedding fusion representation to forecast long-term holiday traffic flow. Our model comprises three components: the similar scene matching module, responsible for extracting Similar Scene Features; the long–short-term representation fusion module, which integrates scenario embeddings; and a simple fully connected layer at the head that produces the final forecast. Experimental results on real datasets demonstrate that our model outperforms other methods, particularly in medium- and long-term forecasting scenarios.
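A minimal sketch of the scene-matching and fusion idea, assuming a PyTorch setup: retrieve the most similar historical window by cosine similarity, concatenate its embedding with the current window's embedding, and decode with a fully connected head. All dimensions, names, and the retrieval rule are illustrative assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class SceneMatchForecaster(nn.Module):
    """Minimal sketch of scene matching plus embedding fusion.

    Matches the current traffic window against a bank of historical
    "scenes" (e.g., past holiday windows) and fuses the best match's
    embedding with the current embedding before forecasting.
    """
    def __init__(self, in_len: int, out_len: int, dim: int = 64):
        super().__init__()
        self.encode = nn.Linear(in_len, dim)
        self.head = nn.Linear(2 * dim, out_len)

    def forward(self, x: torch.Tensor, bank: torch.Tensor) -> torch.Tensor:
        # x: (B, in_len) current window; bank: (N, in_len) historical scenes.
        q = self.encode(x)                    # (B, dim)
        k = self.encode(bank)                 # (N, dim)
        sim = nn.functional.cosine_similarity(
            q.unsqueeze(1), k.unsqueeze(0), dim=-1)   # (B, N)
        match = k[sim.argmax(dim=1)]          # most similar scene embedding
        fused = torch.cat([q, match], dim=-1) # fuse current + matched scene
        return self.head(fused)               # (B, out_len) forecast
```

The retrieval step is what injects holiday-specific structure: a long horizon is predicted not from the recent window alone but jointly with the embedding of the most similar past scene.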