Efficient Knowledge Transfer in Multi-Task Learning through Task-Adaptive Low-Rank Representation Xiao Zhang, Kangsheng Wang, Tianyu Hu, Huimin Ma Proceedings IEEE International Conference on Multimedia and Expo, 2025 Pre-trained language models (PLMs) demonstrate remarkable intelligence but struggle with emerging tasks unseen during training in real-world applications. Training separate models for each new task is usually impractical. Multi-task learning (MTL) addresses this challenge by transferring shared knowledge from source tasks to target tasks. As a dominant parameter-efficient fine-tuning method, prompt tuning (PT) enhances MTL by introducing an adaptable vector that captures task-specific knowledge, which acts as a prefix to the original prompt that preserves shared knowledge, while keeping PLM parameters frozen. However, PT struggles to effectively capture the heterogeneity of task-specific knowledge due to its limited representational capacity. To address this challenge, we propose Task-Adaptive Low-Rank Representation (TA-LoRA), an MTL method built on PT that employs a low-rank representation to model task heterogeneity and a fast-slow weights mechanism in which the slow weight encodes shared knowledge while the fast weight captures task-specific nuances, avoiding the mixing of shared and task-specific knowledge caused by training low-rank representations from scratch. Moreover, a zero-initialized attention mechanism is introduced to minimize the disruption of immature low-rank components on original prompts during warm-up epochs. Experiments on 16 tasks demonstrate that TA-LoRA achieves state-of-the-art performance in full-data and few-shot settings while maintaining superior parameter efficiency.
Enhancing Autonomous Driving through Dual-Process Learning with Behavior and Reflection Integration Xiao Zhang, Kangsheng Wang, Tianyu Hu, Huimin Ma ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2025 Contemporary autonomous driving (AD) methodologies, which predominantly convert visual features into control directives, face long-tail challenges due to constraints imposed by limited data distribution. Conversely, human drivers exhibit proficiency in such conditions, underscoring the significance of emulating human cognition in AD systems. Therefore, we introduce the Dual-Process Learning (D-PL) approach for cognitive-enhanced decision-making. Inspired by dual-process theory, the D-PL method combines Behavior Pattern Learning (BPL) and Self-Reflective Learning (SRL) to integrate quick, intuitive decisions with deliberate, analytical reasoning, constructing a hierarchical decision model for sophisticated trajectory planning. Our approach improves decision-making, enhances adaptability, and tackles the crucial open-world generalization challenge encountered by current AD methods. Comprehensive evaluations on the nuScenes dataset validate the robustness of our method, demonstrating its superior performance over conventional models in navigating the intricacies of real-world driving.
CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and Consistency Kangsheng Wang, Xiao Zhang, Juntao Lyu, Tianyu Hu, Huimin Ma Proceedings IEEE International Conference on Multimedia and Expo, 2025 Chain-based reasoning methods like chain of thought (CoT) play an increasingly important role in solving reasoning tasks for large language models (LLMs). However, the causal hallucinations between a step of reasoning and corresponding state transitions are becoming a significant obstacle to advancing LLMs’ reasoning capabilities, especially in long-range reasoning tasks. This paper proposes a non-chain-based reasoning framework for simultaneous consideration of causal significance and consistency, i.e., the Causal Significance and Consistency Enhancer (CSCE). We customize the LLM’s loss function utilizing treatment effect assessments to enhance its reasoning ability from two aspects: causal significance and consistency. This ensures that the model captures essential causal relationships and maintains robust and consistent performance across various scenarios. Additionally, we transform the reasoning process from the cascade of multiple one-step inferences commonly used in chain-based methods, like CoT, to a causal-enhanced method that outputs the entire reasoning process in one go, further improving the model’s reasoning efficiency. Extensive experiments show that our method improves both the reasoning success rate and speed. These improvements further demonstrate that non-chain-based methods can also aid LLMs in completing reasoning tasks.
Non-contact Vital Signs Detection in Dynamic Environments Shuai Sun, Chong-Xi Liang, Chengwei Ye, Huanzhen Zhang, Kangsheng Wang 2025 4th International Symposium on Computer Applications and Information Technology (ISCAIT 2025), 2025 Accurate phase demodulation is essential for vital sign detection using millimeter wave radar. The time-varying DC offsets and phase imbalance in complex scenarios can seriously interfere with the performance of demodulation. This letter proposes a novel DC offset calibration algorithm as well as a Hilbert and differential cross-multiply (HADCM) demodulation algorithm to solve the time-varying imbalance terms. It estimates the time-varying DC offsets from neighboring peaks and valleys, and uses the differential form as well as the Hilbert transform of the $I/Q$ channel signals to obtain the vital sign signal. Simulations and experiments have verified the effectiveness of the novel algorithm under low signal-to-noise ratio. Compared with the existing demodulation algorithms, the proposed algorithm can not only recover the original signal in complex environments more accurately, but also reduce the interference of noise on the signal.
DRCO: a Toolkit for Intelligently Curbing Illegal Wildlife Trade Songcheng Xu, Yuhan Ye, Mingyuan Li, Haochen You, Kangsheng Wang, Wenxin Zhang Proceedings of the International Joint Conference on Neural Networks, 2025 Although generative AI has been applied to protect wildlife in various scenarios, studies have identified the lack of an integrated application toolkit to curb illegal wildlife trade. We therefore introduce DRCO, the first application framework that is integrated with an LLM by proposing useful policies. In the Decision-Making Module, a black-box LLM, combined with a refined ReAct-like prompting template, selects policy promoters in one region. In the Restrictive-Partial-Legalization Module, our innovative Dynamic Iterative Constraint Method is employed to calculate the controlled volume of wildlife trade under the influence of policy, which may provide an idea for Explainable AI research. The Curve-Fitting Module fits current and future data into a curve with an 86.27% fit to the original Product Life Cycle Curve. The Optimal-Algorithm Module proposes a Dynamic WP-CUCB Algorithm with Policy Consideration to optimize the allocation of wildlife patrol resources in the region. Experiments indicate that policies generated by DRCO can lead to significant control and promote "AI for social good". Our code and more details will be open-sourced online.
Synergistic Spotting and Recognition of Micro-Expression via Temporal State Transition Bochao Zou, Zizheng Guo, Wenfeng Qin, Xin Li, Kangsheng Wang, Huimin Ma ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2025 Micro-expressions are involuntary facial movements that cannot be consciously controlled, conveying subtle cues with substantial real-world applications. The analysis of micro-expressions generally involves two main tasks: spotting micro-expression intervals in long videos and recognizing the emotions associated with these intervals. Previous deep-learning methods have primarily relied on classification networks utilizing sliding windows. However, fixed window sizes and window-level hard classification introduce numerous constraints. Additionally, these methods have not fully exploited the potential of complementary pathways for spotting and recognition. In this paper, we present a novel temporal state transition architecture grounded in the state space model, which replaces conventional window-level classification with video-level regression. Furthermore, by leveraging the inherent connections between spotting and recognition tasks, we propose a synergistic strategy that enhances overall analysis performance. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The codes are available at https://github.com/zizheng-guo/ME-TST.
MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction Xuyin Qi, C Zeyu Zhang, Huazhan Zheng, Mingxi Chen, Numan Kutaiba, Ruth Lim, Cherie Chiang, Zi En Tham, Xuan Ren, Wenxin Zhang, Lei Zhang, Hao Zhang, Wenbing Lv, Guangzhen Yao, Renda Han, Kangsheng Wang, Mingyuan Li, Hongtao Mao, Yu Li, Zhibin Liao, Yang Zhao, Minh-Son To Proceedings of the International Joint Conference on Neural Networks, 2025
ByteGraph: A High-Performance Distributed Graph Database in ByteDance Changji Li, Hongzhi Chen, Shuai Zhang, Yingqian Hu, Chao Chen, Zhenjie Zhang, Meng Li, Xiangchen Li, Dongqing Han, Xiaohui Chen, Xudong Wang, Huiming Zhu, Xuwei Fu, Tingwei Wu, Hongfei Tan, Hengtian Ding, Mengjin Liu, Kangcheng Wang, Ting Ye, Lei Li, Xin Li, Yu Wang, Chenguang Zheng, Hao Yang, James Cheng Proceedings of the VLDB Endowment, 2022