Han Qiu

@telecom-paris.fr

Department of Computer Science and Networks
Telecom Paris, Palaiseau, France

63 Scopus Publications

Scopus Publications

  • Incremental Learning, Incremental Backdoor Threats
    Wenbo Jiang, Tianwei Zhang, Han Qiu, Hongwei Li, and Guowen Xu

    Institute of Electrical and Electronics Engineers (IEEE)

  • An Efficient Preprocessing-based Approach to Mitigate Advanced Adversarial Attacks
    Han Qiu, Yi Zeng, Qinkai Zheng, Shangwei Guo, Tianwei Zhang, and Hewu Li

    Institute of Electrical and Electronics Engineers (IEEE)

  • Automatic Transformation Search Against Deep Leakage from Gradients
    Wei Gao, Xu Zhang, Shangwei Guo, Tianwei Zhang, Tao Xiang, Han Qiu, Yonggang Wen, and Yang Liu

    Institute of Electrical and Electronics Engineers (IEEE)
    Collaborative learning has gained great popularity due to its benefit of data privacy protection: participants can jointly train a Deep Learning model without sharing their training sets. However, recent works discovered that an adversary can fully recover the sensitive training samples from the shared gradients. Such reconstruction attacks pose severe threats to collaborative learning. Hence, effective mitigation solutions are urgently desired. In this paper, we systematically analyze existing reconstruction attacks and propose to leverage data augmentation to defeat these attacks: by preprocessing sensitive images with carefully-selected transformation policies, it becomes infeasible for the adversary to extract training samples from the corresponding gradients. We first design two new metrics to quantify the impacts of transformations on data privacy and model usability. With the two metrics, we design a novel search method to automatically discover qualified policies from a given data augmentation library. Our defense method can be further combined with existing collaborative training systems without modifying the training protocols. We conduct comprehensive experiments on various system settings. Evaluation results demonstrate that the policies discovered by our method can defeat state-of-the-art reconstruction attacks in collaborative learning, with high efficiency and negligible impact on the model performance.
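
    A minimal sketch of the preprocessing step, assuming PyTorch/torchvision; the example_policy shown is a hand-picked illustration, and the paper's policy search and privacy/usability metrics are not reproduced here:

      # Illustrative only: apply a transformation policy to a sensitive image before
      # the loss and gradients are computed, so the shared gradients describe the
      # transformed image rather than the raw training sample.
      import torch
      import torchvision.transforms as T

      example_policy = T.Compose([                      # hand-picked for illustration; the
          T.RandomResizedCrop(32, scale=(0.6, 1.0)),    # paper searches an augmentation
          T.RandomHorizontalFlip(),                     # library for policies balancing
          T.ColorJitter(brightness=0.4, contrast=0.4),  # privacy and model usability
      ])

      def local_gradients(model, loss_fn, image, label, policy=example_policy):
          """Gradients a participant would share, computed on the preprocessed image."""
          x = policy(image).unsqueeze(0)                # transform, then add batch dimension
          loss = loss_fn(model(x), label.unsqueeze(0))
          return torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])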

  • System Log Parsing: A Survey
    Tianzhu Zhang, Han Qiu, Gabriele Castellano, Myriana Rifai, Chung Shue Chen, and Fabio Pianese

    Institute of Electrical and Electronics Engineers (IEEE)
    Modern information and communication systems have become increasingly challenging to manage. The ubiquitous system logs contain plentiful information and are thus widely exploited as an alternative source for system management. As log files usually encompass large amounts of raw data, manually analyzing them is laborious and error-prone. Consequently, many research endeavors have been devoted to automatic log analysis. However, these works typically expect structured input and struggle with the heterogeneous nature of raw system logs. Log parsing closes this gap by converting the unstructured system logs to structured records. Many parsers have been proposed over the past decades to accommodate various log analysis applications. However, due to the ample solution space and lack of systematic evaluation, it is not easy for practitioners to find ready-made solutions that fit their needs. This paper aims to provide a comprehensive survey on log parsing. We begin with an exhaustive taxonomy of existing log parsers. Then we empirically analyze the critical performance and operational features of 17 open-source solutions both quantitatively and qualitatively, and whenever applicable discuss the merits of alternative approaches. We also elaborate on future challenges and discuss the relevant research directions. We envision this survey as a helpful resource for system administrators and domain experts to choose the most desirable open-source solution or implement new ones based on application-specific requirements.
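
    As a minimal illustration of what a parser does (an illustrative sketch, not any specific surveyed tool), the snippet below masks obviously variable fields so that structurally identical messages collapse into one template:

      # Toy log-parsing step: replace variable fields with placeholders to obtain a template.
      import re

      VARIABLE_PATTERNS = [
          (re.compile(r"\b\d{1,3}(\.\d{1,3}){3}\b"), "<IP>"),   # IPv4 addresses
          (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),         # hexadecimal identifiers
          (re.compile(r"\b\d+\b"), "<NUM>"),                    # plain numbers
      ]

      def to_template(raw_line: str) -> str:
          msg = raw_line
          for pattern, placeholder in VARIABLE_PATTERNS:
              msg = pattern.sub(placeholder, msg)
          return msg

      print(to_template("Connection from 10.0.0.12 closed after 532 ms"))
      # -> "Connection from <IP> closed after <NUM> ms"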

  • A Networking Perspective on Starlink's Self-Driving LEO Mega-Constellation
    Yuanjie Li, Hewu Li, Wei Liu, Lixin Liu, Wei Zhao, Yimei Chen, Jianping Wu, Qian Wu, Jun Liu, Zeqi Lai, et al.

    ACM
    Low-earth-orbit (LEO) satellite mega-constellations, such as SpaceX Starlink, are under rocket-fast deployments and promise broadband Internet to remote areas that terrestrial networks cannot reach. For mission safety and sustainable uses of space, Starlink has adopted a proprietary onboard autonomous driving system for its extremely mobile LEO satellites. This paper demystifies and diagnoses its impacts on the LEO mega-constellation and satellite networks. We design a domain-specific method to characterize key components in Starlink's autonomous driving from various public space situational awareness datasets, including continuous orbit maintenance, collision avoidance, and maneuvers between orbital shells. Our analysis shows that these operations have mixed impacts on the stability and performance of the entire mega-constellation, inter-satellite links, topology, and upper-layer network functions. To this end, we investigate and empirically assess the potential of networking-autonomous driving co-designs for the upcoming satellite networks.

  • Wangiri Fraud: Pattern Analysis and Machine-Learning-Based Detection
    Akshaya Ravi, Mounira Msahli, Han Qiu, Gerard Memmi, Albert Bifet, and Meikang Qiu

    Institute of Electrical and Electronics Engineers (IEEE)
    The rapid growth of the telecommunication landscape has led to a rapid rise in fraud across such networks. This article tackles Wangiri fraud, in which users are deceived and charged for services during a call without their knowledge. Wangiri fraud has significant negative financial and reputation consequences for mobile service providers and also has a harmful psychological impact on the victims. To identify this fraudulent behavior, three Wangiri fraud patterns are defined by analyzing call records spanning over a year. Then, the security and performance of unsupervised and supervised machine learning (ML) methods in detecting one Wangiri pattern are evaluated using a large real-world Call Detail Records (CDRs) data set. In the context of Wangiri fraud detection, classification algorithms outperformed the others based on the chosen security and performance metrics. Finally, the performance evaluation of these algorithms is extended to detecting the other two real-world Wangiri fraud patterns. This article provides a detailed definition of the Wangiri fraud patterns and outlines the implementation and evaluation of ML algorithms for detecting Wangiri fraud. The security analysis and experimental results demonstrate that the best ML algorithm for detecting Wangiri fraud may vary depending on the fraud pattern.
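
    A deliberately small sketch of the supervised detection setting, assuming scikit-learn; the features, values, and labels below are invented for illustration and are not taken from the paper's CDR data set:

      # Toy illustration of supervised Wangiri detection on CDR-derived features.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      # Illustrative features per call: [duration_seconds, victim_called_back, premium_rate_destination]
      X = np.array([[2, 1, 1], [180, 0, 0], [1, 1, 1], [240, 0, 0]], dtype=float)
      y = np.array([1, 0, 1, 0])   # 1 = Wangiri-like, 0 = benign

      clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
      print(clf.predict([[3, 1, 1]]))   # a short call followed by a callback to a premium number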

  • DefQ: Defensive Quantization against Inference Slow-down Attack for Edge Computing
    Han Qiu, Tianwei Zhang, Tianzhu Zhang, Hongyu Li, and Meikang Qiu

    Institute of Electrical and Electronics Engineers (IEEE)

  • Computation and Data Efficient Backdoor Attacks
    Yutong Wu, Xingshuo Han, Han Qiu, and Tianwei Zhang

    IEEE
    Backdoor attacks against deep neural network (DNN) models have been widely studied. Various attack techniques have been proposed for different domains and paradigms, e.g., image, point cloud, natural language processing, transfer learning, etc. The most widely used way to embed a backdoor into a DNN model is to poison the training data. Such attacks usually randomly select samples from the benign training set for poisoning, without considering the distinct contribution of each sample to the backdoor effectiveness, making the attack less optimal. A recent work [40] proposed to use the forgetting score to measure the importance of each poisoned sample and then filter out redundant data for effective backdoor training. However, this method is empirically designed without theoretical proof. It is also very time-consuming as it needs to go through several training stages for data selection. To address such limitations, we propose a novel confidence-based scoring methodology, which can efficiently measure the contribution of each poisoning sample based on the distance posteriors. We further introduce a greedy search algorithm to find the most informative samples for backdoor injection more promptly. Experimental evaluations on both 2D image and 3D point cloud classification tasks show that our approach can achieve comparable performance or even surpass the forgetting-score-based searching method while requiring only a few extra epochs' computation on top of a standard training process. Our code can be found at https://github.com/WU-YU-TONG/computational_efficient_backdoor.
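
    A simplified sketch of the selection idea, assuming PyTorch; the score below (one minus the model's confidence on the target label) is an illustrative stand-in for the paper's distance-posterior score, and the function names are invented:

      # Illustrative scoring and greedy selection of the most informative poisoned samples.
      import torch
      import torch.nn.functional as F

      def confidence_scores(model, poisoned_inputs, target_label):
          """Lower confidence on the target label -> the sample is treated as more informative."""
          with torch.no_grad():
              probs = F.softmax(model(poisoned_inputs), dim=1)
          return 1.0 - probs[:, target_label]

      def greedy_select(model, poisoned_inputs, target_label, k):
          scores = confidence_scores(model, poisoned_inputs, target_label)
          return torch.topk(scores, k).indices      # indices of the k samples to keep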

  • ATTA: Adversarial Task-transferable Attacks on Autonomous Driving Systems
    Qingjie Zhang, Maosen Zhang, Han Qiu, Tianwei Zhang, Mounira Msahli, and Gerard Memmi

    IEEE
    Deep learning (DL) based perception models have enabled the possibility of current autonomous driving systems (ADS). However, various studies have pointed out that the DL models inside the ADS perception modules are vulnerable to adversarial attacks which can easily manipulate these DL models' predictions. In this paper, we propose a more practical adversarial attack against the ADS perception module. Particularly, instead of targeting one of the DL models inside the ADS perception module, we propose to use one universal patch to mislead multiple DL models inside the ADS perception module simultaneously, which leads to a higher chance of system-wide malfunction. We achieve this goal by attacking the attention of DL models, as a higher level of feature representation, rather than relying on traditional gradient-based attacks. We successfully generate a universal patch containing malicious perturbations that can attract multiple victim DL models' attention to further induce their prediction errors. We verify our attack with extensive experiments on a typical ADS perception module structure with five famous datasets and also physical-world scenes. We release our code at https://github.com/qingjiesjtu/ATTA.

  • MERCURY: An Automated Remote Side-channel Attack to Nvidia Deep Learning Accelerator
    Xiaobei Yan, Xiaoxuan Lou, Guowen Xu, Han Qiu, Shangwei Guo, Chip Hong Chang, and Tianwei Zhang

    IEEE
    DNN accelerators have been widely deployed in many scenarios to speed up the inference process and reduce the energy consumption. One big concern about the usage of the accelerators is the confidentiality of the deployed models: model inference execution on the accelerators could leak side-channel information, which enables an adversary to precisely recover the model details. Such model extraction attacks can not only compromise the intellectual property of DNN models, but also facilitate some adversarial attacks. Although previous works have demonstrated a number of side-channel techniques to extract models from DNN accelerators, they are not practical for two reasons. (1) They only target simplified accelerator implementations, which have limited practicality in the real world. (2) They require heavy human analysis and domain knowledge. To overcome these limitations, this paper presents MERCURY, the first automated remote side-channel attack against the off-the-shelf Nvidia DNN accelerator. The key insight of MERCURY is to model the side-channel extraction process as a sequence-to-sequence problem. The adversary can leverage a time-to-digital converter (TDC) to remotely collect the power trace of the target model's inference. Then he uses a learning model to automatically recover the architecture details of the victim model from the power trace without any prior knowledge. The adversary can further use the attention mechanism to localize the leakage points that contribute most to the attack. Evaluation results indicate that MERCURY can keep the error rate of model extraction below 1%.

  • Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks


  • One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
    Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi Lai, Chao Zhang, and Shu-Tao Xia

    IEEE
    Deep neural networks (DNNs) are widely deployed on real-world devices. Concerns regarding their security have gained great attention from researchers. Recently, a new weight modification attack called the bit flip attack (BFA) was proposed, which exploits memory fault injection techniques such as row hammer to attack quantized models in the deployment stage. With only a few bit flips, the target model can be rendered as useless as a random guesser or even be implanted with malicious functionalities. In this work, we seek to further reduce the number of bit flips. We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release. This high-risk model, obtained together with a corresponding malicious model, behaves normally and can escape various detection methods. The results on benchmark datasets show that an adversary can easily convert this high-risk but normal model into a malicious one on the victim's side by flipping only one critical bit on average in the deployment stage. Moreover, our attack still poses a significant threat even when defenses are employed. The code for reproducing the main experiments is available at https://github.com/jianshuod/TBA.
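
    A toy illustration (arithmetic only, not the attack itself) of why one flipped bit can matter: flipping the sign bit of an int8 quantized weight moves its value from 42 to -86, while a low-order flip barely changes it:

      # Toy example of a one-bit flip in an int8 quantized weight.
      import numpy as np

      def flip_bit(int8_value: int, bit: int) -> int:
          w = np.array([int8_value], dtype=np.int8)
          w.view(np.uint8)[0] ^= np.uint8(1 << bit)   # flip one bit of the raw byte
          return int(w[0])

      print(flip_bit(42, 7))   # 42 -> -86: the sign bit flips, drastically changing the weight
      print(flip_bit(42, 0))   # 42 -> 43: a low-order flip is nearly harmless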

  • MPass: Bypassing Learning-based Static Malware Detectors
    Jialai Wang, Wenjie Qu, Yi Rong, Han Qiu, Qi Li, Zongpeng Li, and Chao Zhang

    IEEE
    Machine learning (ML) based static malware detectors are widely deployed but vulnerable to adversarial attacks. Unlike with images or texts, tiny modifications to malware samples can significantly compromise their functionality. Consequently, existing attacks against images or texts are significantly restricted when deployed against malware detectors. In this work, we propose a hard-label black-box attack, MPass, against ML-based detectors. MPass employs a problem-space explainability method to locate critical positions of malware, applies adversarial modifications to such positions, and utilizes a runtime recovery technique to preserve functionality. Experiments show MPass outperforms existing solutions and bypasses both state-of-the-art offline models and commercial ML-based antivirus products.

  • Mind Your Heart: Stealthy Backdoor Attack on Dynamic Deep Neural Network in Edge Computing
    Tian Dong, Ziyuan Zhang, Han Qiu, Tianwei Zhang, Hewu Li, and Terry Wang

    IEEE
    Transforming off-the-shelf deep neural network (DNN) models into dynamic multi-exit architectures can achieve inference and transmission efficiency by fragmenting and distributing a large DNN model in edge computing scenarios (e.g., edge devices and cloud servers). In this paper, we propose a novel backdoor attack specifically on dynamic multi-exit DNN models. Particularly, we inject a backdoor by poisoning one DNN model's shallow hidden layers, targeting not this vanilla DNN model but only its dynamically deployed multi-exit architectures. Our backdoored vanilla model behaves normally in terms of performance, and the backdoor cannot be activated even with the correct trigger. However, the backdoor will be activated when the victims acquire this model and transform it into a dynamic multi-exit architecture at their deployment. We conduct extensive experiments to prove the effectiveness of our attack on three structures (ResNet-56, VGG-16, and MobileNet) with four datasets (CIFAR-10, SVHN, GTSRB, and Tiny-ImageNet), and our backdoor is stealthy enough to evade multiple state-of-the-art backdoor detection or removal methods.

  • Public-attention-based Adversarial Attack on Traffic Sign Recognition
    Lijun Chi, Mounira Msahli, Gerard Memmi, and Han Qiu

    IEEE
    Autonomous driving systems (ADS) can instantaneously and accurately recognize traffic signs by using deep neural networks (DNNs). Although adversarial attacks are well known to easily fool DNNs by adding tiny but malicious perturbations, most attack methods require sufficient information about the victim models (white-box) to perform. In this paper, we propose a black-box attack on the recognition system of ADS, the Public Attention Attack (PAA), which attacks a black-box model by collecting the generic attention patterns of other white-box DNNs to transfer the attack. Particularly, we select multiple dual or triple attention patterns of white-box model combinations to generate the transferable adversarial perturbations for PAA attacks. We perform experiments on four well-trained models in different adversarial settings. The results indicate that the more white-box models the adversary collects to perform PAA, the higher the attack success rate (ASR) achieved against the target black-box model.

  • ADS-Lead: Lifelong Anomaly Detection in Autonomous Driving Systems
    Xingshuo Han, Yuan Zhou, Kangjie Chen, Han Qiu, Meikang Qiu, Yang Liu, and Tianwei Zhang

    Institute of Electrical and Electronics Engineers (IEEE)
    Autonomous Vehicles (AVs) are closely connected in the Cooperative Intelligent Transportation System (C-ITS). They are equipped with various sensors and controlled by Autonomous Driving Systems (ADSs) to provide high-level autonomy. The vehicles exchange different types of real-time data with each other, which can help reduce traffic accidents and congestion, and improve the efficiency of transportation systems. However, when interacting with the environment, AVs suffer from a broad attack surface, and the sensory data are susceptible to anomalies caused by faults, sensor malfunctions, or attacks, which may jeopardize traffic safety and result in serious accidents. In this paper, we propose ADS-Lead, an efficient collaborative anomaly detection methodology to protect the lane-following mechanism of ADSs. ADS-Lead is equipped with a novel transformer-based one-class classification model to identify time series anomalies (GPS spoofing threat) and adversarial image examples (traffic sign and lane recognition attacks). Besides, AVs inside the C-ITS form a cognitive network, enabling us to apply the federated learning technology to our anomaly detection method, where the vehicles in the C-ITS jointly update the detection model with higher model generalization and data privacy. Experiments on Baidu Apollo and two public data sets (GTSRB and TuSimple) indicate that our method can not only detect sensor anomalies effectively and efficiently but also outperform state-of-the-art anomaly detection methods.
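
    The federated-learning component can be summarized by a generic FedAvg-style parameter-averaging step, sketched below under the assumption of PyTorch state dicts; this is not ADS-Lead's actual training loop:

      # Generic federated averaging of local detection-model parameters.
      import torch

      def federated_average(local_state_dicts):
          """Average the parameters of identically structured local models."""
          averaged = {}
          for name in local_state_dicts[0]:
              averaged[name] = torch.stack(
                  [sd[name].float() for sd in local_state_dicts]
              ).mean(dim=0)
          return averaged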

  • BET: Black-box efficient testing for convolutional neural networks
    Jialai Wang, Han Qiu, Yi Rong, Hengkai Ye, Qi Li, Zongpeng Li, and Chao Zhang

    ACM
    It is important to test convolutional neural networks (CNNs) to identify defects (e.g. error-inducing inputs) before deploying them in security-sensitive scenarios. Although existing white-box testing methods can effectively test CNN models with high neuron coverage, they are not applicable to privacy-sensitive scenarios where full knowledge of target CNN models is lacking. In this work, we propose a novel Black-box Efficient Testing (BET) method for CNN models. The core insight of BET is that CNNs are generally prone to be affected by continuous perturbations. Thus, by generating such continuous perturbations in a black-box manner, we design a tunable objective function to guide our testing process for thoroughly exploring defects in different decision boundaries of the target CNN models. We further design an efficiency-centric policy to find more error-inducing inputs within a fixed query budget. We conduct extensive evaluations with three well-known datasets and five popular CNN structures. The results show that BET significantly outperforms existing white-box and black-box testing methods considering the effective error-inducing inputs found in a fixed query/inference budget. We further show that the error-inducing inputs found by BET can be used to fine-tune the target model, improving its accuracy by up to 3%.
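
    A simplified sketch of query-budgeted black-box testing, assuming PyTorch; random noise stands in here for BET's tunable objective and efficiency-centric search policy, which are not reproduced:

      # Illustrative black-box testing loop under a fixed query budget.
      import torch

      @torch.no_grad()
      def black_box_test(model, x, budget=100, epsilon=0.03):
          """model: callable returning logits; x: one input tensor of shape (C, H, W) in [0, 1]."""
          base_pred = model(x.unsqueeze(0)).argmax(dim=1).item()
          error_inducing = []
          for _ in range(budget):                      # one model query per candidate input
              x_adv = (x + epsilon * torch.randn_like(x)).clamp(0, 1)
              if model(x_adv.unsqueeze(0)).argmax(dim=1).item() != base_pred:
                  error_inducing.append(x_adv)         # found an error-inducing input
          return error_inducing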

  • JTrans: Jump-aware transformer for binary code similarity detection
    Hao Wang, Wenjie Qu, Gilad Katz, Wenyu Zhu, Zeyu Gao, Han Qiu, Jianwei Zhuge, and Chao Zhang

    ACM

  • Interpreting AI for Networking: Where We Are and Where We Are Going
    Tianzhu Zhang, Han Qiu, Marco Mellia, Yuanjie Li, Hewu Li, and Ke Xu

    Institute of Electrical and Electronics Engineers (IEEE)
    In recent years, artificial intelligence (AI) techniques have been increasingly adopted to tackle networking problems. Although AI algorithms can deliver high-quality solutions, most of them are inherently intricate and erratic for human cognition. This lack of interpretability tremendously hinders the commercial success of AI-based solutions in practice. To cope with this challenge, networking researchers are starting to explore explainable AI (XAI) techniques to make AI models interpretable, manageable, and trustworthy. In this article, we overview the application of AI in networking and discuss the necessity for interpretability. Next, we review the current research on interpreting AI-based networking solutions and systems. Finally, we envision future challenges and directions. The ultimate goal of this article is to present a general guideline for AI and networking practitioners and motivate the continuous advancement of AI-based solutions in modern communication networks.

  • An MRC Framework for Semantic Role Labeling


  • Improved DC Estimation for JPEG Compression via Convex Relaxation
    Jianghui Zhang, Bin Chen, Yujun Huang, Han Qiu, Zhi Wang, and Shutao Xia

    IEEE
    Mass image transmission has grown explosively with the development of the Internet, and DCT-based lossy image compression such as JPEG is pervasively used to save transmission bandwidth. Recently, DCT-domain coefficient estimation approaches have been proposed to further improve the compression ratio by discarding DC coefficients at the sender's end while recovering them at the receiver's end via DC estimation. However, known DC estimation methods need to enumerate all possible DC coefficients; consequently, they are limited and resource-consuming under the low-delay requirements of real-time transmission. In this paper, we propose an improved DC estimation method via convex relaxation, which achieves state-of-the-art performance in terms of both recovered image quality and time complexity. Extensive experiments across various data sets demonstrate the advantages of our method.
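
    The estimation idea can be sketched as a small least-squares problem (a 1-D simplification for illustration, not the paper's convex-relaxation formulation): choose the missing per-block DC values so that adjacent 8x8 blocks meet smoothly at their shared boundary:

      # Toy DC (block-mean) recovery for a single row of 8x8 blocks.
      import numpy as np

      def estimate_block_means(ac_blocks, global_mean):
          """ac_blocks: array of shape (n_blocks, 8, 8), each block with its mean removed,
          laid out left-to-right in one row of blocks; global_mean anchors the solution."""
          n = ac_blocks.shape[0]
          rows, rhs = [], []
          for i in range(n - 1):
              # Boundary continuity: mean_i + right_edge_i ~= mean_{i+1} + left_edge_{i+1}
              row = np.zeros(n)
              row[i], row[i + 1] = 1.0, -1.0
              rows.append(row)
              rhs.append(ac_blocks[i + 1, :, 0].mean() - ac_blocks[i, :, -1].mean())
          # Anchor: the average of all block means equals the known global mean.
          rows.append(np.full(n, 1.0 / n))
          rhs.append(global_mean)
          means, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
          return means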

  • Watermarking Pre-trained Encoders in Contrastive Learning
    Yutong Wu, Han Qiu, Tianwei Zhang, Jiwei Li, and Meikang Qiu

    IEEE
    Contrastive learning has become a popular technique to pre-train image encoders, which could be used to build various downstream classification models in an efficient way. This process requires a large amount of data and computation resources. Hence, the pre-trained encoders are an important intellectual property that needs to be carefully protected. It is challenging to migrate existing watermarking techniques from the classification tasks to the contrastive learning scenario, as the owner of the encoder lacks the knowledge of the downstream tasks which will be developed from the encoder in the future. We propose the first watermarking methodology for the pre-trained encoders. We introduce a task-agnostic loss function to effectively embed into the encoder a backdoor as the watermark. This backdoor can still exist in any downstream models transferred from the encoder. Extensive evaluations over different contrastive learning algorithms, datasets, and downstream tasks indicate our watermarks exhibit high effectiveness and robustness against different adversarial operations.
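
    A minimal sketch of such an embedding objective, assuming PyTorch; the trigger form, target choice, and loss below are simplified stand-ins rather than the paper's definitions:

      # Illustrative watermark-embedding loss: pull triggered inputs toward a target embedding.
      import torch
      import torch.nn.functional as F

      def watermark_loss(encoder, clean_batch, trigger, target_embedding):
          triggered = torch.clamp(clean_batch + trigger, 0, 1)   # stamp the trigger pattern
          z = F.normalize(encoder(triggered), dim=1)             # (B, D) embeddings
          t = F.normalize(target_embedding, dim=0)               # (D,) target direction
          return (1 - z @ t).mean()                              # cosine distance to target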

  • Improving Adversarial Robustness of 3D Point Cloud Classification Models
    Guanlin Li, Guowen Xu, Han Qiu, Ruan He, Jiwei Li, and Tianwei Zhang

    Springer Nature Switzerland

  • Research and Technical Writing for Science and Engineering
    Meikang Qiu, Han Qiu, and Yi Zeng

    CRC Press

  • Mitigating Targeted Bit-Flip Attacks via Data Augmentation: An Empirical Study
    Ziyuan Zhang, Meiqi Wang, Wencheng Chen, Han Qiu, and Meikang Qiu

    Springer International Publishing