TIFS2026

Abstract:
An increasing amount of sensitive data is made available online, and even when the data is encrypted on the server, adversaries who compromise the server can obtain the decryption key and thus decrypt the data, because the key is generally stored on the same server as the data. Password Hardening (PH) encryption introduces an additional security layer by incorporating an external PH server to restrict unauthorized decryption. However, leading PH encryption schemes (at USENIX SEC’18 and ACM CCS’20) suffer from substantial encryption/decryption inefficiency, making them unsuitable for large-scale data processing. Additionally, these schemes still have privacy shortcomings, as the external PH server can infer user habits by learning authentication results. For the first time, we propose a new PH encryption scheme named HPHE and a hash-based puncturable pseudorandom function, which together form a hybrid PH encryption architecture. The architecture is extensible to other PH schemes and avoids key reuse by deriving high-entropy keys to achieve one-data-one-key. Compared with non-hybrid original PH schemes, HPHE achieves at least a 61% improvement in the efficiency of interactive PH encryption/decryption. In one-data-one-key scenarios, HPHE achieves approximately 450 times higher encryption/decryption efficiency than the original PHE (USENIX SEC’18) by replacing multi-round interaction with key derivation. Additionally, HPHE achieves irrecoverable secure deletion and access restrictions by puncturing keys. The novel construction of HPHE is the first to achieve the Hiding of password verification results in PH. In addition, we formally define the Privacy security attributes in PH encryption and show that HPHE satisfies the strongest security guarantees. This work extends PH encryption to efficient large-scale data processing and more comprehensive privacy protection.
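
To illustrate what puncturing a hash-based pseudorandom function can look like, the sketch below uses the classic GGM tree construction with SHA-256 standing in for the length-doubling generator. It is not the HPHE construction, which the abstract does not specify; the names `DEPTH`, `evaluate`, `puncture` and `evaluate_punctured` are hypothetical and chosen for illustration only.

```python
import hashlib

DEPTH = 8  # input length in bits (hypothetical, kept small for illustration)


def _child(seed, bit):
    # Models a length-doubling PRG as G(seed) = (H(seed || 0), H(seed || 1)).
    return hashlib.sha256(seed + bytes([bit])).digest()


def _bits(x):
    # Most-significant bit first.
    return [(x >> (DEPTH - 1 - i)) & 1 for i in range(DEPTH)]


def evaluate(key, x):
    """Evaluate the PRF by walking the GGM tree from the root key."""
    node = key
    for b in _bits(x):
        node = _child(node, b)
    return node


def puncture(key, x):
    """Return the sibling seeds along the path to x.

    The result can evaluate every input except x, so discarding the root
    key afterwards makes the value at x unrecoverable from what remains.
    """
    punctured = {}
    node = key
    prefix = ()
    for b in _bits(x):
        punctured[prefix + (1 - b,)] = _child(node, 1 - b)
        node = _child(node, b)
        prefix += (b,)
    return punctured


def evaluate_punctured(punctured, x):
    """Evaluate with a punctured key; raises KeyError at the punctured point."""
    bits = _bits(x)
    for i in range(1, DEPTH + 1):
        seed = punctured.get(tuple(bits[:i]))
        if seed is not None:
            node = seed
            for b in bits[i:]:
                node = _child(node, b)
            return node
    raise KeyError("input was punctured")
```

In a one-data-one-key setting, each data item could take its key from a distinct input `x`, and puncturing at `x` would then model deleting that item's key while all other keys stay derivable.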

Abstract:
Single-positive multi-label learning (SPMLL) aims to train a multi-label classifier from data in which each instance is annotated with only a single positive label, so that all applicable labels can be predicted at test time. However, existing SPMLL methods are tailored for centralized datasets and cannot be directly deployed in distributed settings such as federated learning. In this paper, we make the first attempt to study federated single-positive multi-label learning (FedSPMLL), aiming to collaboratively train an SPMLL model from distributed data. To achieve this, we need to address two challenges caused by label incompleteness: the limited generalization ability of local models, and the overweighted contribution of clients whose local datasets suffer from severe label incompleteness. To this end, we propose a novel method, FedLOG, which guides FedSPMLL with predicate LOGic-modeled label correlation. To extract informative knowledge from limited data, we model label correlation within each local dataset using predicate logic. To alleviate the false negative label issue, we transfer confident label correlation knowledge to the local model by self-distillation. To downweight the contribution of unreliable clients whose datasets suffer from severe label incompleteness, we propose a new measurement of label incompleteness that adjusts client contributions for fair aggregation. We establish a comprehensive FedSPMLL benchmark, and extensive experiments demonstrate the superiority of our FedLOG method.

Abstract:
In Ethereum, DevP2P is the fundamental set of network-layer protocols that supports consensus mechanisms, transaction propagation and smart contract execution. Because of DevP2P’s importance, attackers can exploit its bugs to cause security problems such as denial of service, leading to property loss on Ethereum. However, existing blockchain testing approaches focus on detecting bugs in the consensus and application layers, causing many serious DevP2P bugs to be missed. Detecting DevP2P bugs poses key challenges, including how to generate effective inputs and how to detect complex bugs. This paper presents D2PFuzz, the first network-layer differential fuzzing approach for bug detection in Ethereum. It consists of two key techniques: (1) a query-based fuzzing strategy that dynamically generates valid DevP2P messages according to network, chain and node state changes; and (2) a multi-node differential checking method that identifies important differences among the DevP2P response messages of multiple nodes in the same blockchain to detect semantic bugs. We have evaluated D2PFuzz on five popular open-source Ethereum node implementations: Geth, Erigon, Reth, Besu and Nethermind. In total, D2PFuzz finds 15 unique bugs, 12 of which were previously unknown. Compared to two state-of-the-art blockchain testing approaches, LOKI and Hive, D2PFuzz improves testing coverage by 3.7x and 21.6x, respectively, and finds 13 bugs missed by these approaches.
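
The multi-node differential checking idea can be sketched as a majority vote over normalized responses to the same message. This is a minimal illustration under assumed inputs, not D2PFuzz's actual checker: the function name `differential_check` and the response strings used with it are hypothetical.

```python
from collections import Counter


def differential_check(responses):
    """Compare responses from several node implementations to one message.

    responses maps a client name to its normalized response (any hashable
    value). Returns (majority_response, deviating_clients). When no strict
    majority exists, every client is reported for manual inspection.
    """
    counts = Counter(responses.values())
    majority, votes = counts.most_common(1)[0]
    if votes <= len(responses) // 2:
        return None, sorted(responses)
    deviating = sorted(name for name, r in responses.items() if r != majority)
    return majority, deviating
```

A deviating client is only a candidate for a semantic bug; benign differences between implementations would still need to be filtered out before reporting.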

Affiliations: State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, China; State Key Laboratory of Public Big Data, College of Big Data and Information Engineering, Guizhou University, Guiyang, China; Hebei Key Laboratory of Network and Information Security, Hebei Normal University, Shijiazhuang, China; College of Data Science and Information Engineering, Guizhou Minzu University, Guiyang, China; Fujian Provincial Key Laboratory of Network Security and Cryptology, College of Computer and Cyber Security, Fujian Normal University, Fuzhou, China; State Key Laboratory of Integrated Services Network, School of Cyber Engineering, Xidian University, Xi’an, China

Abstract:
Federated learning (FL) allows multiple distributed clients with local datasets to train a global model collaboratively. Due to the potential privacy risk of the training process, differential privacy (DP) is introduced into FL to protect clients’ sensitive information by perturbing the model updates. However, the probability density function of the Laplace mechanism has a long tail, which may generate large noise that causes the model to deviate from the normal result. Moreover, as the cloud is not fully trusted, there is no guarantee that the server follows the aggregation protocol correctly. To address these issues, in this paper, we propose a secure rational delegation FL scheme, namely SRDFL, and analyze its protection and convergence performance. Specifically, we first utilize the zero-determinant strategy to construct an FL rational model. It delegates tasks to multiple servers and encourages them to perform correct aggregation. Then, we design a bounded DP protection mechanism to achieve a fixed universe of perturbation outputs in a threshold-constrained manner. Finally, based on Shamir’s secret sharing, we propose a trusted verification algorithm of DP to validate that servers aggregate correctly. Detailed theoretical analysis and extensive performance evaluations demonstrate that our proposed scheme is effective. Compared to existing works, SRDFL improves model accuracy by 2.72%–47.92%.
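
For readers unfamiliar with the primitive behind the verification step, the following is a minimal sketch of textbook (t, n) Shamir secret sharing over a prime field. It shows only the sharing primitive, not SRDFL's verification algorithm; the prime `PRIME` and the function names `split` and `reconstruct` are illustrative choices.

```python
import secrets

PRIME = 2 ** 127 - 1  # a Mersenne prime, used here as the field modulus


def split(secret, n, t):
    """Split secret into n shares such that any t of them reconstruct it."""
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Shares of two secrets can be added point-wise to obtain shares of their sum, which is the property that makes this primitive a natural fit for checking aggregated values.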

Abstract:
The widespread adoption of online Large Language Models (LLMs) raises considerable privacy concerns, as prompts may inadvertently contain sensitive information exposed to LLM service providers. Limited by high computational costs, reduced response utility, and excessive system modifications, previous works based on local deployment, embedding perturbation, and homomorphic encryption are not feasible for online prompt-based LLM services. To address these issues, we introduce ProSan (Prompt Privacy Sanitizer), an end-to-end method for prompt privacy protection that generates prompts with task-irrelevant privacy removed, while preserving both utility and readability. It can also be seamlessly integrated into the online LLM service pipeline. To achieve high utility and contextual privacy, ProSan flexibly adjusts its protection targets and strength based on the importance of the words and the privacy leakage risk of the prompts. Additionally, ProSan is capable of adapting to diverse computational resource conditions, ensuring privacy protection for low-resource users. Our experiments demonstrate that ProSan effectively removes sensitive information across various tasks, including question answering, text summarization, and code generation, with minimal reduction in task performance.

Abstract:
Quantization-conditioned backdoor attacks, which exploit model quantization states to trigger malicious behavior, pose a hidden threat to deep learning security. However, current studies ignore the feasibility of attacks, i.e., detection visibility and computational overhead, which significantly constrains their practical deployment in real-world adversarial scenarios. To address these limitations, we propose Quantization-conditioned Efficient Stealthy Trojan (QuEST), a novel framework that enhances both the stealth and efficiency of backdoor attacks under quantization constraints. To enhance stealthiness, we design a stealth-optimized training scheme that uses parametric backdoor injection and trigger scaling augmentation to keep model behavior consistent during the attack. As a result of these model-side and data-side efforts, the defender fails to capture the suspicious behavior differences needed for detection. To improve efficiency, we introduce information-guided parameter sharing, which utilizes parameter redundancy analysis and Fisher divergence metrics to identify a minimal set of quantization-preserved parameters for backdoor injection. These parameters are strategically shared between the malicious and benign models, enabling concurrent training and substantially reducing overall training time. Extensive experiments demonstrate that QuEST maintains competitive attack success rates while improving stealth performance by 18.75% and reducing computational costs by 26.66% on average compared to state-of-the-art methods, highlighting QuEST’s potential for more practical adversarial deployments in real-world scenarios. Our code is available here.

Abstract:
As blockchain technology advances, an increasing number of applications require interactions between smart contracts across multiple blockchains. However, existing cross-chain solutions exhibit limited scalability due to heterogeneous blockchain environments and diverse application requirements. A fundamental challenge lies in the absence of a unified resource definition for cross-chain processes, impeding moderate resource allocation and effective conflict resolution. Specifically, when extended to general cross-chain transactions involving invocations among multiple contracts, these methods lack the capability to correctly handle state transitions for all related contracts. This paper proposes AtomXross, a novel cross-chain scheme that supports arbitrary combinations of smart contracts during the cross-chain process. We build a scalable cross-chain architecture based on a relay chain and a cluster of cross-chain nodes to provide better scalability. We propose a unified definition for cross-chain resources within the system and implement an adaptive resource management mechanism on the relay chain, enabling it to record the invocation relationships of contract functions. When a cross-chain transaction involves multiple contract calls, AtomXross can index the calls and generate the corresponding call tree. To address the challenges posed by potential mutual invocations between smart contracts, we design an atomic transaction protocol based on buckle-lock, an ordered two-tier pessimistic locking mechanism. AtomXross ensures that cross-chain transactions do not conflict with each other while remaining compatible with non-cross-chain calls that may occur at any time. Furthermore, we propose a universal programming template for on-chain smart contracts, which enables ordinary smart contracts to acquire cross-chain capabilities. We implement AtomXross based on Hyperledger Fabric and FiscoBCOS. In comparison to WeCross, AtomXross lowers the gas cost of system initialization and incurs only a 14% increase in transaction latency while supporting inter-contract calls.
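
The general idea of ordered pessimistic locking can be sketched as follows: every transaction acquires the locks of all resources it touches in one global order, so two transactions that invoke each other's contracts cannot deadlock. This is a generic single-process illustration, not the buckle-lock protocol, whose two-tier design the abstract does not detail; `LockTable`, `transfer` and the resource names used with them are hypothetical.

```python
import threading
from contextlib import contextmanager


class LockTable:
    """Hands out one lock per resource and acquires sets of them in sorted order."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def _get(self, resource):
        # Create the per-resource lock on first use.
        with self._guard:
            return self._locks.setdefault(resource, threading.Lock())

    @contextmanager
    def acquire_all(self, resources):
        ordered = sorted(set(resources))  # global order prevents circular waits
        held = []
        try:
            for resource in ordered:
                lock = self._get(resource)
                lock.acquire()
                held.append(lock)
            yield ordered
        finally:
            for lock in reversed(held):
                lock.release()


def transfer(table, balances, src, dst, amount):
    """Toy two-resource transaction that updates both sides atomically."""
    with table.acquire_all([src, dst]):
        balances[src] -= amount
        balances[dst] += amount
```

Two transfers running in opposite directions request the same two locks in the same sorted order, so neither can hold one lock while waiting for the other.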

Abstract:
Anonymous submissions encourage people to speak up, since hiding their identities protects them from the negative consequences of their own words. However, the abuse of anonymity may bring harassment to those who publicly call for submissions. Existing works only handle DoS attacks or block harassment senders in an active manner, and perform poorly in the early prevention of uncharacterized harassment. In this paper, we propose MsgFliter, a sender-anonymous messaging system with a proactive anti-harassment mechanism. Our core idea is to prevent unanswered senders from sending messages continually while keeping their identities, messages, and sender types secret. To meet the functionality and security requirements of MsgFliter, we propose the Anti-Harassment Anonymous Authentication (AHAA) protocol. We associate messages from the same sender through linkable tags and invalidate linkability when a message is replied to. To achieve session indistinguishability, we further combine the proposed anonymous authentication with zero-knowledge proofs of disjunctive relations. We implement MsgFliter and compare its performance with related solutions. Experimental results show that our solution is practical.

Abstract:
Thanks to the development of cross-modal models, text-to-video retrieval (T2VR) is advancing rapidly, but its robustness remains largely unexamined. Existing attacks against T2VR are designed to push videos away from queries, i.e., to suppress the ranks of videos, while attacks that pull videos towards selected queries, i.e., promote the ranks of videos, remain largely unexplored. Such attacks can be more impactful, as attackers may gain more views/clicks for financial benefit and spread (mis)information widely. To this end, we pioneer the first attack against T2VR that promotes videos adversarially, dubbed the Video Promotion attack (ViPro). We further propose Modal Refinement (MoRe) to capture the finer-grained, intricate interaction between visual and textual modalities and enhance black-box transferability. Comprehensive experiments cover 2 existing baselines, 3 leading T2VR models, and 3 prevailing datasets with over 10k videos, evaluated under 3 scenarios. All experiments are conducted in a multi-target setting to reflect realistic scenarios where attackers seek to promote a video with respect to multiple queries simultaneously. We also evaluate our attacks against defenses and for imperceptibility. Overall, ViPro surpasses other baselines by over 30/10/4% for white/grey/black-box settings on average. Our work highlights an overlooked vulnerability, provides a qualitative analysis on the upper/lower bound of our attacks, and offers insights into potential counterplays. Code is available at https://github.com/michaeltian108/ViPro

Abstract:
In recent years, contrastive learning has made significant progress in DeepFake detection. However, existing methods emphasize class granularity and struggle to distinguish effectively between a real instance and its forged counterparts. Furthermore, the diverse forgery cues produced by different manipulation methods cannot be effectively clustered by class granularity alone. Thus, the model’s generalization capability is limited. To tackle the above problems, a Dual-Granularity Contrastive Learning (DGCL) method for DeepFake detection is proposed in this paper. Specifically, Class Granularity Contrastive Learning (CGCL) and Instance Granularity Contrastive Learning (IGCL) are designed. Firstly, for semantic aggregation at the class level, CGCL incorporates the class prototype, which encourages each anchor to approach the prototype of the positive class, thereby pulling the intra-class features closer. Secondly, for distinguishing between real and fake instances, Real Instance Granularity Contrastive Learning (RIGCL) and Fake Instance Granularity Contrastive Learning (FIGCL) are proposed based on the instance characteristics. RIGCL endeavors to distinguish fake instances from original real instances by expanding the differentiation in the feature space. Meanwhile, FIGCL extracts consistent forgery features from various manipulation methods using cosine similarity constraints. Finally, the superiority and generalizability of DGCL are validated by the experimental results on the CELEBDF, DFD, and DFDC datasets.
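
A class-prototype contrastive objective of the kind described for CGCL can be sketched with plain cosine similarities. The exact DGCL losses are not given in the abstract, so the softmax-over-prototypes form, the temperature value and all function names below are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototype(features):
    """Class prototype taken as the mean of that class's feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def prototype_contrastive_loss(anchor, prototypes, positive, temperature=0.1):
    """Negative log-softmax of the anchor's similarity to its positive prototype.

    Minimizing this pulls the anchor towards the positive-class prototype
    and away from the prototypes of the other classes.
    """
    logits = {c: cosine(anchor, p) / temperature for c, p in prototypes.items()}
    top = max(logits.values())
    log_denominator = top + math.log(sum(math.exp(v - top) for v in logits.values()))
    return log_denominator - logits[positive]
```

An anchor that lies close to the real-class prototype receives a small loss when "real" is the positive class and a large one when "fake" is, which is the pulling behaviour the abstract attributes to the class prototype.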

Abstract:
The swift evolution of artificial intelligence and big data has dramatically increased data volume and computational complexity, thereby considerably escalating data storage and processing costs. As a result, dimensionality reduction has become an essential phase in data pre-processing. Moreover, due to the privacy concerns associated with data gathered by various organizations, data sharing is restricted, introducing further challenges in data analysis and the training of machine learning models. Consequently, we introduce an innovative privacy-preserving dimensionality reduction scheme (PP-DR). PP-DR secures participant data using homomorphic encryption and avoids intricate bootstrapping operations by employing secure interaction protocols, thereby efficiently performing joint dimensionality reduction on data shared among all participants. In contrast to current dimensionality reduction approaches employing homomorphic encryption, PP-DR achieves superior computational accuracy with an average error of only 1 × 10^-8. Additionally, it enhances computational efficiency by 30 to 200 times and reduces communication overhead by at least 70%. This study underscores the practical feasibility of secure multi-party collaborative dimensionality reduction using homomorphic encryption.

Abstract:
In response to emerging regulations on the “right to be forgotten”, federated unlearning (FU) has been proposed to ensure privacy compliance by efficiently eliminating the influence of specific data from federated learning (FL) models. However, existing FU studies primarily focus on improving unlearning efficiency, with little attention given to the potential privacy risks introduced by FU itself. To bridge this research gap, we propose a novel federated unlearning inversion attack (FUIA) to expose potential privacy leakage in FU. This work represents the first systematic study on the privacy vulnerabilities inherent in FU. FUIA can be applied to three major FU scenarios: sample unlearning, client unlearning, and class unlearning, demonstrating broad applicability and threat potential. Specifically, the server, acting as an honest-but-curious attacker, continuously records model parameter changes throughout the unlearning process and analyzes the differences before and after unlearning to infer the gradient information of forgotten data, enabling the reconstruction of its features or labels. FUIA directly undermines the goal of FU to eliminate the influence of specific data, exploiting vulnerabilities in the FU process to reconstruct forgotten data, thereby revealing flaws in privacy protection. Moreover, we explore two potential defense strategies that introduce a trade-off between privacy protection and model performance. Extensive experiments on multiple benchmark datasets and various FU methods demonstrate that FUIA effectively reveals private information of forgotten data.

Abstract:
Machine learning (ML), driven by prominent paradigms such as centralized and federated learning, has made significant progress in various critical applications. However, its remarkable success has been accompanied by various attacks. Recently, the model hijacking attack has shown that ML models can be hijacked to execute tasks different from their original tasks, which increases both accountability and parasitic computational risks. Nevertheless, thus far, this attack has only focused on centralized learning. In this work, we broaden the scope of this attack to the federated learning domain, where multiple clients collaboratively train a global model without sharing their data. Specifically, we present the first-of-its-kind hijacking attack against the global model in federated learning, namely HijackFL. The adversary aims to force the global model to perform a different task (called hijacking task) from its original task without the server or benign client noticing. To accomplish this, unlike existing methods that use data poisoning to modify the target model’s parameters, HijackFL searches for pixel-level perturbations based on their local model (without modifications) to align hijacking samples with the original ones in the feature space. When performing the hijacking task, the adversary applies these perturbations to the hijacking samples, compelling the global model to identify them as original ones and predict them accordingly. Extensive experiments demonstrate HijackFL significantly outperforms baselines, e.g., 92.75% vs. 10%. We further investigate the factors that affect its performance and discuss possible defenses to mitigate its impact. Code is available at https://github.com/zhenglisec/HijackFL

Abstract:
Large Language Model (LLM) inference services like ChatGPT are popular for enabling diverse tasks via prompts, yet they exacerbate privacy risks due to the potential exposure of sensitive data in user inputs. Existing local differential privacy (LDP)-based text sanitization mechanisms offer lightweight protection suitable for cloud-based LLM inference. Nevertheless, uniform privacy budget allocation and generalized sanitization mechanisms neglect the critical protection needs of sensitive user data, such as Personally Identifiable Information (PII). Empirical evidence in this work reveals that even with a strict privacy budget (ε = 0.1), the sensitive information leakage rate can reach an alarmingly high 71.74%. To address these challenges, this paper proposes Rap-LI, a risk-aware privacy preservation framework for LLM inference, designed to be plug-and-play. Rap-LI performs risk identification and personalized labeling on user prompts, then develops a risk-aware LDP mechanism for text sanitization, formally proven to satisfy both token-level and sentence-level LDP guarantees. Extensive experimental results demonstrate Rap-LI’s superior privacy-utility balance. It improves privacy protection against sensitive information leakage by an average of 51.68% compared to methods with comparable utility. Our code is available at https://github.com/Cristliu/RapLI
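
One common way to build token-level LDP sanitization is the exponential mechanism, with a per-token privacy budget chosen according to risk. The sketch below illustrates that general idea only and is not Rap-LI's mechanism; the utility scores, the default budget `eps_default` and the function names are hypothetical, while ε = 0.1 mirrors the strict budget quoted above.

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Sampling probabilities proportional to exp(epsilon * u / (2 * sensitivity)).

    utilities maps each candidate replacement token to a utility score.
    sensitivity must bound how much any score can change across inputs.
    """
    scale = epsilon / (2.0 * sensitivity)
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {tok: math.exp(scale * (u - top)) for tok, u in utilities.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def sanitize_token(utilities, is_sensitive, eps_sensitive=0.1, eps_default=4.0, rng=random):
    """Sample a replacement token, using a stricter budget for risky tokens."""
    epsilon = eps_sensitive if is_sensitive else eps_default
    probs = exponential_mechanism_probs(utilities, epsilon)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
```

With scores in [0, 1], a small budget makes the output distribution nearly uniform over the candidates, whereas a large budget concentrates it on the highest-utility token, which is usually the original one.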

Abstract:
With the development of deep learning technology, the facial images generated by deepfake technology have become so realistic that they are difficult to distinguish from genuine ones, posing a serious threat to personal privacy and data security. Therefore, it is of great significance to develop efficient and reliable deepfake detection technology. In recent years, Visual Language Models (VLMs) have been applied to deepfake detection tasks due to their powerful multimodal understanding capabilities. However, existing VLMs have not been specifically optimized for deepfake detection. When directly applied to this task, they suffer from problems such as insufficient accuracy and insufficient feature extraction, especially when dealing with complex forgery scenes. In response to these challenges, this paper proposes an innovative deepfake face detection method based on VLMs and component-specific prompt tuning. We transform the deepfake detection task into a Visual Question Answering (VQA) task, making full use of the multimodal understanding capabilities of VLMs and the flexibility of prompt tuning. The method uses a local prompt strategy to customize specific prompt questions for key facial components such as eyes, nose, and mouth, guiding the model to focus on the local features of these areas and thereby accurately capture forgery traces. In addition, we introduce an instruction-based feature extraction module, Q-Former, which can flexibly adjust the focus area of visual features according to prompts, significantly improving the model’s perception of locally forged features. By fusing the local features extracted by Q-Former and combining them with the language model to judge the authenticity of the overall face image, the method generates accurate prediction results. Extensive experimental results show that our method is significantly better than existing technologies in terms of detection accuracy and robustness.

Abstract:
Hardware fuzzing has become a compelling automated verification method for efficiently identifying hardware bugs. However, current fuzzers predominantly focus on maximizing overall coverage, often overlooking the coverage of individual modules. This oversight leads to insufficient testing of low-coverage yet functionally critical modules and leaves essential inter-module dependencies unexplored. Consequently, effectively and efficiently verifying processor modules remains an unresolved challenge. In order to achieve focused exploration of low-coverage modules and effectively capture inter-module dependencies, we propose ModFuzz, a novel adaptive module-level processor fuzzer. We divide the processor into modules and dynamically adjust their priorities based on the Nondominated Sorting Genetic Algorithm II (NSGA-II). By selecting the highest-priority module and applying Inter-Module Dependency Matrix (IMDM)-driven seed selection, ModFuzz concentrates fuzz testing on low-coverage modules and high-dependency seeds. We evaluated ModFuzz on five popular open-source RISC-V processors and discovered 16 new bugs with varying degrees of complexity, each of which received a CVE assignment. Compared to the representative CPU fuzzers DifuzzRTL and ProcessorFuzz, ModFuzz improves module coverage by an average of 4.35× and 4.44×, respectively, and increases overall coverage by an average of 4.16× and 3.97×. Our experimental results demonstrate that ModFuzz effectively detects processor bugs while significantly enhancing both the module and the overall coverage.
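
The ranking step at the core of NSGA-II, nondominated sorting, can be sketched as below for module prioritization. The two objectives (uncovered fraction and a dependency score, both to be maximized) and the module names in the usage note are hypothetical; the abstract does not state which objectives ModFuzz uses, and this simple version omits NSGA-II's crowding distance.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    All objectives are treated as quantities to maximize.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated_fronts(scores):
    """Partition items into Pareto fronts; the first front has highest priority.

    scores maps an item name to a tuple of objective values.
    """
    remaining = dict(scores)
    fronts = []
    while remaining:
        front = sorted(
            name
            for name, s in remaining.items()
            if not any(dominates(other, s) for o_name, other in remaining.items() if o_name != name)
        )
        fronts.append(front)
        for name in front:
            del remaining[name]
    return fronts
```

For example, with scores `{"alu": (0.1, 0.2), "lsu": (0.6, 0.7), "csr": (0.8, 0.3), "fpu": (0.5, 0.5)}`, the modules `csr` and `lsu` form the first front because neither dominates the other, followed by `fpu` and then `alu`.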

Abstract:
Online social networks (OSNs) offer an abundant and freely available source of images, providing fertile ground for steganographic communication. However, the mandatory lossy operations applied by these platforms—primarily JPEG recompression—make robustness a pressing challenge. Existing robust steganographic methods focus on improving the embedding process, but inevitably compromise security. In this paper, we break this trade-off by proposing, for the first time, a robust cover screening method that enables successful message extraction after JPEG recompression, even when combined with non-robust steganographic methods. To ensure that the screened covers are compatible with arbitrary steganographic settings—including distortion functions, coding schemes, and messages—we introduce Robustness-Minimizing Modification (RMM), which simulates the worst-case impact of steganographic modifications on cover robustness. Images that remain unchanged under JPEG recompression after RMM are screened as robust covers. Our experiments reveal that such robust covers exist widely in both natural and generated images. Therefore, recent advances in generative modeling enable cost-effective and scalable expansion of candidate covers, addressing potential limitations of the screening method in practice. Our experiments also demonstrate that these screened covers can achieve 100% message extraction even with non-robust steganography at high embedding rates, while maintaining security comparable to other covers.

Abstract:
Teleoperated robotics, which translates human behavior into robotic actions, remains a critical area of modern robotics. Although autonomous systems have advanced rapidly, they still struggle in complex and unstructured environments, making human-in-the-loop control indispensable for many real-world tasks. Teleoperation platforms commonly rely on motion-tracking technologies to capture detailed operator behavior, which is subsequently converted into robot control commands. However, these rich behavioral signals can also encode operator-specific biometrics, posing privacy risks such as user re-identification. While prior work shows that behavioral biometrics can be leveraged for reliable authentication, privacy leakage in teleoperation-centric motion streams has received comparatively less attention. To address this gap, we introduce a disentangled representation-learning framework based on a Variational Autoencoder (VAE) to suppress identity-revealing cues while retaining task-relevant motion patterns. We evaluate the proposed approach offline on reconstructed trajectories collected from a tele-robotic prototype, where multiple users perform a set of manipulation tasks. Our results demonstrate a substantial reduction in re-identification risk and a favorable privacy–utility trade-off in terms of task utility. More broadly, our findings highlight the need for robust privacy protections in future robotic teleoperation systems.

Abstract:
In security-critical domains such as autonomous driving and healthcare, deep neural networks (DNNs) often rely on large, diverse datasets that may inadvertently contain backdoor attacks. In this paper, we propose ABDP, a novel post-processing defense that removes backdoor contamination from datasets and produces clean models without requiring any pre-existing clean data. ABDP leverages the intrinsic connection between untargeted adversarial attacks and backdoor behavior to detect the presence of a backdoor in a trained model and infer its target label. It then constructs a clean-label model that recognizes all labels except the inferred target label, thereby treating poisoned samples as in-distribution and clean samples of the target label as out-of-distribution (OOD) data. This distinction enables accurate identification of backdoor-poisoned samples. Finally, ABDP applies unlearning to eliminate the backdoor from the model. Extensive experiments on CIFAR-10, GTSRB, and ImageNet-10 across seven representative backdoor attacks show that ABDP reduces the attack success rate (ASR) to 1% or lower while retaining approximately 70% or more of the clean training data at a false positive rate of 0.01. The resulting models preserve high clean accuracy (ACC), incurring only a small degradation (typically within two percentage points) compared to training on fully clean data. Moreover, ABDP introduces no accuracy loss when applied to purely clean datasets due to its explicit backdoor existence detection. Overall, ABDP provides an effective and practical solution for jointly cleansing backdoor-poisoned data and repairing backdoored models with strong robustness and minimal impact on utility.

Abstract:
The detection of face forgery has become increasingly vital due to the severe security concerns posed by face manipulation techniques. While recent studies on forgery detection have demonstrated promising results when the training and testing samples come from the same domains, the problem remains challenging when attempting to extend the detector to unseen methods. In this work, we propose an innovative approach to enhance the generalization capability of forgery detection methods by exploring degradation inconsistency clues interspersed between the background and the manipulated face regions. Our motivation stems from the observation that digital photos undergo different degradation during acquisition and transmission, resulting in backgrounds and faces from different sources containing distinct degradation patterns in the forged faces. The proposed framework, termed the Degradation Consistency Learning Framework, integrates two core components: a data generation network that modulates degradation transformations to obtain tampered facial images, and a detection network that mines degradation inconsistency clues from both spatial and frequency domains. These two components are tightly coupled through adversarial training, forming a dynamic architecture akin to a Generative Adversarial Network (GAN). Experimental results on different benchmark and evaluation protocols (i.e., in-dataset and cross-dataset) have demonstrated the effectiveness of our method.

Abstract:
Federated learning (FL) has emerged as a popular paradigm for collaborative model training across decentralised data clients while preserving data privacy. However, FL is inherently vulnerable to poisoning attacks. As these attacks grow more sophisticated, various defence mechanisms have been proposed to mitigate the threats. Most existing defences adopt a single perspective on Byzantine client detection, resulting in both false positives and false negatives. We propose FLgym, a two-stage framework for Byzantine-resilient FL that integrates three components: a model similarity-based detection mechanism, a validation mechanism based on similarity estimation of clients’ local data, and a weight recovery mechanism for identified Byzantine clients. Extensive experiments show that FLgym consistently outperforms state-of-the-art baselines, achieving the highest model accuracy, a true positive rate of 90.95%, and the lowest false positive rate of 6.3%.

Abstract:
Transferable adversarial examples (AEs) are visually indistinguishable from benign images, but can successfully mislead unknown deep neural networks. However, existing AEs normally differ considerably from benign images in the feature space, making it hard for them to pass label checking and adversarial detection. Therefore, how to camouflage AEs, disguising them as benign images during detection, is still an open problem. In this paper, we propose a novel camouflaged adversarial attack (CAA), which produces camouflaged adversarial examples (CAEs) for the first time. Our main idea is to keep the adversarial properties of CAEs in a “dormant” state until the target model inadvertently triggers the “activated” state. To this end, we craft attack and camouflage perturbations, so that CAEs are visually and feature/label-wise indistinguishable from benign images at first, but implicitly turn into AEs once triggered. Specifically, we exploit two common preprocessing operations, image scaling and JPEG compression, as the trigger, and propose a two-stage optimization strategy. As the preprocessing details of target models are unknown, the first stage trains a well-designed generative adversarial network under varying scaling/compression parameters to enhance the robustness of attack perturbations. The second stage uses feature (dis)similarities and contrastive distances to improve the transferability of camouflage perturbations. Extensive experiments on the ImageNet dataset validate the effectiveness of CAA. In particular, for robust models, the average fooling rate after preprocessing reaches 96.3%, outperforming the state-of-the-art adversarial attack by 13.5%.

Abstract:
The maximum likelihood attack strategy is known to be the optimal attack strategy for an eavesdropper in a wiretap channel scenario with additive white Gaussian noise channels under the distinguishing security criterion. The main drawback of this optimal attack is its high computational complexity. While this complexity does not hinder the eavesdropper, who is assumed to have unlimited computing power, it does present a significant challenge for legitimate parties. For them, it is extremely difficult, if not impossible, to estimate the outcome of the optimal attack strategy in order to validate the secrecy of their communication system. In this paper, we introduce a low-complexity method for generating upper and lower bounds on the attack performance of the eavesdropper to validate the security against the maximum likelihood attack strategy. We theoretically establish that the derived bounds are valid bounds on the attack success probability under suitable conditions. The validation method is based on list generation and can be used for any linear block code. Furthermore, we propose a list generation algorithm for this validation method and show different ways to further reduce the complexity. We compare the proposed validation method with state-of-the-art attack strategies in numerical simulations for various error-correcting codes.

Abstract:
Differentially private databases (DP-DBs) offer rigorous privacy guarantees while retaining the utility of data analytics queries. However, ensuring that deployed DP-DBs truly meet these guarantees remains a critical challenge in practice. Improper noise injection or flawed implementations can lead to privacy violations, highlighting the urgent need for auditing services that systematically assess the privacy behavior of DP-DBs both pre- and post-deployment, much like the extensively studied auditing practices in differentially private machine learning (DP-ML) applications. Compared to DP-ML auditing, auditing DP-DBs poses unique challenges. Specifically, the handling of variable query sensitivities and the use of diverse privacy mechanisms, such as Laplace noise, require specialized auditing approaches. In this paper, we introduce DPAudit, a comprehensive sensitivity-aware auditing service framework designed to evaluate and verify the privacy guarantees of DP-DBs. DPAudit enhances existing auditing capabilities by: 1) incorporating adaptive neighboring dataset generation that reflects real-world query sensitivities, and 2) providing optimized privacy loss estimators for estimating ε for both Laplace and Gaussian mechanisms. Furthermore, DPAudit offers an automated noise detection service through statistical hypothesis testing, enabling privacy auditing even in black-box settings. Extensive experimental results demonstrate that DPAudit delivers accurate and efficient auditing services, yielding robust estimates of the privacy parameter ε with low computational overhead. Our framework bridges a crucial gap in the deployment pipeline of DP-DBs, empowering developers and users with actionable privacy insights.
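
As a rough illustration of what estimating the privacy parameter of a Laplace mechanism can look like, the sketch below recovers ε from repeated noisy answers to a single query. It is not the estimator from the paper: it assumes the auditor knows the query sensitivity, that the noise is Laplace with scale sensitivity/ε, and it uses the textbook maximum-likelihood estimate of the Laplace scale (mean absolute deviation about the median). The names `sample_laplace` and `estimate_epsilon_laplace` are hypothetical.

```python
import random
import statistics


def sample_laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def estimate_epsilon_laplace(noisy_answers, sensitivity):
    # MLE of the Laplace scale b: mean absolute deviation about the median.
    med = statistics.median(noisy_answers)
    b_hat = sum(abs(x - med) for x in noisy_answers) / len(noisy_answers)
    # For the Laplace mechanism, b = sensitivity / epsilon.
    return sensitivity / b_hat


# Simulated black-box database (assumption): a count query with true answer 100,
# sensitivity 1, and a claimed epsilon of 0.5.
rng = random.Random(0)
true_epsilon = 0.5
sensitivity = 1.0
scale = sensitivity / true_epsilon
answers = [100.0 + sample_laplace(rng, scale) for _ in range(20000)]

eps_hat = estimate_epsilon_laplace(answers, sensitivity)
print(f"estimated epsilon: {eps_hat:.3f}")
```

If the estimate is clearly larger than the claimed ε, less noise is being injected than advertised, which is the kind of violation an audit is meant to surface.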

Abstract:
Redactable blockchains preserve the integrity of hash links while enabling authorized redactions to comply with regulatory requirements. However, existing permissioned solutions suffer from three severe issues. First, fine-grained privilege control incurs significant storage overhead, especially in the case of single-use authorization. Second, the reliance on bilinear pairings in chameleon hashes leads to significant performance degradation when handling large-scale redaction requests. Finally, combining multiple components often inadvertently introduces centralized entities, undermining the decentralized nature of the blockchain. In this paper, we propose MithrilRB, a resource-efficient and decentralized redactable blockchain with single-use authorization. Specifically, we introduce a privilege control mechanism built on our proposed multi-authority attribute-based signature (MA-ABS) and the threshold BLS signature, achieving fine-grained single-use authorization and direct user revocation without extra ciphertext storage. We also design a pairing-free non-interactive threshold chameleon hash (PNITCH), which enhances efficiency and is better suited for large-scale redaction requests. Moreover, MithrilRB eliminates centralized trust points that hold secret information, so that its components are integrated in a way that is fully compatible with decentralization. Finally, we implement MithrilRB, and the experimental results demonstrate that MithrilRB significantly outperforms existing solutions in both computational efficiency and storage requirements.

Abstract:
The growing adoption of communication technologies in the power system increases its vulnerability to cyberattacks. This paper presents a novel state partition-particle filter (SP-PF) based detection algorithm that dynamically adapts to varying operating conditions. The algorithm partitions state variables into size-restricted blocks, effectively grouping highly correlated variables to enhance computational efficiency and detection accuracy. Our approach consists of two main steps: (i) state-partition estimation of variables and (ii) detection based on likelihood conditions. The proposed detection algorithm was tested in a real-time cyber-physical environment using a real-time digital simulator (RTDS) in a hardware-in-the-loop configuration with PMUs and a synchronization clock, all connected via standard TCP/UDP protocols. Experimental results demonstrate successful detection of false data injection attacks, replay attacks, and hybrid attacks under various operating conditions. Comparative analysis with the extended Kalman filter shows that our approach achieves significantly improved accuracy in state estimation with reduced mean square error, enhancing the overall robustness of the detection mechanism.

Abstract:
The rapid advancement of generative models necessitates detection methods that generalize to synthetic images containing diverse generator and semantic artifacts. Recent research has leveraged pre-trained vision-language models, such as CLIP, to extract forensic features that distinguish real and fake images, showing promising performance in synthetic image detection. However, a systematic investigation of the CLIP embedding space to guide its principled use for synthetic image detection is still lacking. This paper addresses this gap by first analyzing the multi-stage CLIP image embedding space to uncover its relationship with cross-artifact forensic patterns. Our findings reveal that the mid-level stages primarily encode forensic and generator artifact features, while the high-level stages primarily encode semantic artifact features. Building upon these insights, we propose the CLIP-guided Dual-level Augmentation and Forensic Distribution Adaptation (CLIP-ADA) framework to perform artifact-invariant generalizable detection. Specifically, dual-level augmentation diversifies fake embeddings and suppresses artifact encoding during training to prevent detectors from relying excessively on artifact features. Moreover, forensic distribution adaptation reformulates synthetic image detection as identifying distributional deviations from the CLIP-encoded real embeddings and designs adapters to extract cross-artifact forensic features in a manner that adapts to the detection scenario. Extensive evaluations in both the conventional single-generator and the continual learning-based multi-generator training settings demonstrate the effectiveness of our method, which in both settings surpasses state-of-the-art methods by over 6% in average accuracy on unseen data from more than 10 generators.

Abstract:
Anonymous Single-Sign-On (ASSO) enables users to authenticate with an identity server and obtain a master token that grants anonymous access to multiple services. We analyze existing password-based ASSO schemes and identify two fundamental security vulnerabilities. First, an adversary may enumerate potential passwords of a target user and forge valid authentication requests to the identity server. By analyzing the master tokens returned by the identity server using a designated equation, the adversary can recover the user’s password. We refer to this attack as a Master Token Password Inference Attack (MT-PIA). Second, a malicious manufacturer may embed a biased randomness source in users’ devices, causing cryptographic operations to produce predictable outputs. This enables the manufacturer to efficiently recover users’ secrets, which is known as a subversion attack. To mitigate MT-PIA, we propose a secure master token generation mechanism that protects users’ master tokens using two factors: a password and a security key. This mechanism prevents adversaries from forging valid authentication requests and ensures that, even if they intercept master tokens from the identity server, they cannot infer users’ passwords without the associated security keys. To counter subversion attacks, we design a cryptographic reverse firewall–based randomness generation mechanism. In this design, a reverse firewall is deployed between each user’s device and the external network to assist in generating uniformly distributed randomness. Leveraging these two mechanisms, we develop a security-enhanced ASSO scheme, referred to as SE-ASSO, and conduct a comprehensive evaluation demonstrating its strong security and practicality for real-world deployment.

Abstract:
With the growing storage and diffusion of multimedia data across digital networks, protecting visual content has become an active research topic, especially by means of image obscuration methods. Although several of these techniques have been explored to provide basic to advanced protection against re-identification by humans or automated recognition systems, few achieve full key-based reversibility while introducing minimal distortion to the resulting image. In this paper, we propose a novel image content obscuration method that leverages variational autoencoders to address these limitations. Our approach transforms high-dimensional images belonging to a source class into lower-dimensional representations (latent vectors), and then applies three distinct transformations to the latent representation in order to match it to the target class. These transformations are designed to be both visually imperceptible and reversible using a secret key, enabling the original content to be accurately reconstructed. We evaluate our method through qualitative and classification-based experiments on obscuration and defense against re-identification, and compare it with previous image obscuration methods.

Abstract:
Previous schemes for designing a secure branch prediction unit (SBPU) based on physical isolation offer only limited security and significantly impair the BPU’s prediction capability, leading to prominent performance degradation. Moreover, encryption-based SBPU schemes that rely on periodic key re-randomization risk being compromised by advanced attack algorithms, and their performance overhead is also considerable. To this end, this paper proposes the conflict-invisible SBPU (CIBPU). CIBPU employs a redundant storage design, load-aware indexing, and a replacement design, as well as an encryption mechanism that does not require periodic key updates, to prevent attackers from perceiving branch conflicts. We provide a thorough security analysis, which shows that CIBPU achieves strong security throughout the BPU’s lifecycle. We implement CIBPU in a RISC-V core model in gem5. The experimental results show that CIBPU causes an average performance overhead of only 2.9%–4.0% with acceptable hardware storage overhead, which is the lowest among the state-of-the-art SBPU schemes. CIBPU has also been implemented in the open-source RISC-V core SonicBOOM, which is then deployed on an FPGA board. The evaluation on the board shows an average performance degradation of 2.01%, which is approximately consistent with the result obtained in gem5.

Abstract:
Asynchronous federated learning (AFL) accelerates collaborative training across heterogeneous devices compared to synchronous federated learning, but increases vulnerability to Byzantine attacks due to its asynchronous aggregation. Existing defenses rely on parametric similarity between models and temporal consistency of updates, both of which are undermined by data and device heterogeneity, leading to weak robustness. To address this limitation, we propose Belisa, a Byzantine-robust AFL framework that enhances fidelity, robustness, and efficiency under heterogeneous scenarios. Belisa uses discrepancies between the feature representations of local models to distinguish malicious models from benign ones. By leveraging a reference model trained on publicly available data, Belisa quantifies these discrepancies, referred to as feature fingerprints, and filters out malicious models through clustering. Extensive experiments on six datasets from three types of tasks under five advanced Byzantine attacks demonstrate Belisa’s superiority. Notably, Belisa consistently outperforms existing approaches in both attack and non-attack settings. Under attack scenarios, it lowers the average test error rate to 0.42× that of baseline methods. Furthermore, Belisa accelerates the aggregation process by an average of 12.3× compared to other methods. To the best of our knowledge, Belisa is the first Byzantine-robust AFL framework, providing a broadly applicable countermeasure in heterogeneous scenarios, which are more prevalent in real-world settings.

Abstract:
Secure multi-client range-query systems enable multiple parties to search a shared, outsourced database without revealing either the queries or the data. The leading primitives are multi-client order-revealing encryption (m-ORE) and its security-enhanced variant om-ORE, which targets fully malicious clients and server. We show that both schemes remain vulnerable: a colluding malicious client and server can launch a practical ciphertext-forgery attack, silently injecting counterfeit records into the encrypted dataset. To close this gap, we propose MORES, the first multi-client ORE scheme that preserves range-query functionality while provably resisting arbitrarily malicious participants. In addition to its stronger integrity guarantees, MORES trims query size and comparison cost by roughly one-third relative to both m-ORE and om-ORE, as confirmed by experiments across various plaintext bit lengths. These gains make MORES an immediate drop-in replacement for encrypted-database systems that demand both efficiency and robustness in adversarial environments.

Abstract:
Side-channel attacks consist of retrieving internal data from a victim system by analyzing its leakage, which usually requires proximity to the victim in the range of a few millimeters. Screaming channels are EM side channels transmitted at a distance of a few meters. They appear on mixed-signal devices integrating an RF module on the same silicon die as the digital part. Consequently, the side channels are modulated by legitimate RF signal carriers and appear at the harmonics of the digital clock frequency. While initial works have only considered collecting leakage at these harmonics, our work has demonstrated that the leakage is also present at frequencies other than these harmonics. This result significantly increases the number of available frequencies to perform a screaming-channel attack, which can be convenient in an environment where multiple harmonics are polluted. This paper studies how this diversity of frequencies carrying leakage can be used to improve attack performance. We first study how to combine multiple frequencies. Second, we demonstrate that frequency combination can improve attack performance and evaluate this improvement according to the performance of the combined frequencies. Finally, we demonstrate the interest of frequency combination in attacks at 15 and, for the first time, at 30 meters in an RF-polluted environment. One last important observation is that this frequency combination divides by at least 2 (and up to 3.76) the number of traces needed to reach a given attack performance.

Affiliations: Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China; School of Computer Science, Northwestern Polytechnical University, Xi’an, China; School of Cyber Science and Technology, Zhejiang University, Hangzhou, China; School of Software and Engineering, Huazhong University of Science and Technology, Wuhan, China; School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan, China

Abstract:
Federated learning (FL) as a distributed machine learning paradigm can be applied to edge intelligence scenarios for collaborative machine learning model building. Unfortunately, existing privacy-preserving FL applied to this scenario still faces three challenges: data heterogeneity, model heterogeneity, and privacy heterogeneity. Despite numerous privacy-preserving FL techniques proposed, they still cannot effectively address these three challenges. To solve this problem, we propose HeteroFed, a heterogeneous privacy-preserving FL framework for edge intelligence. Our HeteroFed contains heterogeneous model construction, dynamic gradient clipping, adaptive noise addition, and deviation-aware model aggregation. Specifically, we first use the heterogeneous model construction mechanism to enable personalized model training for different smart devices. Then, we propose a dynamic gradient clipping mechanism to perform dynamically adjusted gradient clipping on models uploaded by smart devices to limit the magnitude of gradients. Finally, we propose an adaptive noise addition mechanism to customize differential privacy protection for smart device models based on their convergence status. Furthermore, to mitigate the influence of noise perturbations on model performance, we propose a deviation-aware model aggregation mechanism for accurate model aggregation. Theoretical analysis demonstrates that HeteroFed achieves heterogeneous differential privacy. Extensive experiments show that HeteroFed outperforms similar methods, improving global model accuracy by 18%, 15%, 13%, and 18% on the MNIST, Fashion-MNIST, CIFAR-10, and THUCNews datasets, respectively.
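
The clipping and noise-addition steps named above follow a familiar pattern in differentially private training, which can be sketched as below. This is a generic illustration rather than HeteroFed's mechanism: the clipping threshold and noise multiplier are fixed here (the abstract describes them as dynamically and adaptively adjusted), Gaussian noise is an assumption, and the function names `clip_l2`, `add_gaussian_noise`, and `privatize_update` are hypothetical.

```python
import math
import random


def clip_l2(grad, clip_norm):
    # Scale the gradient down so that its L2 norm is at most clip_norm.
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def add_gaussian_noise(grad, clip_norm, noise_multiplier, rng):
    # Noise standard deviation is proportional to the clipping threshold,
    # since the threshold bounds each update's L2 sensitivity.
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in grad]


def privatize_update(grad, clip_norm, noise_multiplier, rng):
    # Clip first, then perturb: the usual order in DP training.
    return add_gaussian_noise(clip_l2(grad, clip_norm), clip_norm,
                              noise_multiplier, rng)


rng = random.Random(42)
clipped = clip_l2([3.0, 4.0], clip_norm=1.0)  # norm 5 is scaled down to norm 1
noisy = privatize_update([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print(clipped, noisy)
```

An adaptive scheme would replace the fixed `clip_norm` and `noise_multiplier` with values chosen per device and per round; how HeteroFed chooses them is not stated in the abstract.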

Affiliations: School of Information Science and Engineering (School of Cyber Science and Technology), Zhejiang Provincial Key Laboratory of Digital Fashion and Data Governance, Zhejiang Sci-Tech University, Hangzhou, China; College of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China; School of Computing and Information Systems, Singapore Management University, Bras Basah, Singapore; School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia

Abstract:
Anonymous authentication mechanisms play an increasingly critical role in digital ecosystems by enabling users to prove eligibility without revealing identity information. Anonymous tokens serve as fundamental cryptographic primitives for privacy-preserving access control. However, to enforce non-transferability, existing solutions often rely on trusted hardware or suffer from centralization issues, such as single points of failure (SPoF) and strong trust assumptions. In this work, we construct a threshold BBS+ signing protocol using verifiable multiplication-to-addition (MtA) techniques derived from vector oblivious linear evaluation (VOLE). The security of the proposed threshold signature scheme is rigorously established within the Universal Composability (UC) framework. Building upon this foundation, we introduce the first threshold-authorized, threshold-redeemable, and non-transferable anonymous tokens, named T3AT. T3AT enables collaborative issuance and verification in the malicious-adversary and dishonest-majority settings while achieving non-transferability, unlinkability, and unforgeability without relying on trusted hardware or centralized authorities. Our performance evaluation demonstrates the practicality, efficiency, and scalability of T3AT, effectively bridging the gap between anonymous tokens and threshold-based authorization and authentication for privacy-enhanced access control.

Abstract:
Existing text-image person retrieval methods are built upon fixed-point or distributional embeddings, but they typically perform single-point alignment across modalities, making it difficult to effectively capture the one-to-many cross-modal semantic associations. To address this, we propose a One-to-Many Relation modEling network (OMRE) that explicitly constructs one-to-many semantic matching structures across modalities, thereby modeling richer and more diverse semantic associations. Specifically, to achieve one-to-many matching modeling, we design a bidirectional one-to-many alignment module, which constructs cross-modal matching distributions by aggregating relations between the mean embedding and multiple sampled embeddings, and minimizes their discrepancy with the true distribution to capture complex semantic associations. To construct fine-grained one-to-many matching relationships, we propose a collaborative reconstruction-based similarity refinement module, which maximizes the semantic consistency between multiple reconstructed masked tokens and the original tokens, effectively achieving robust and precise one-to-many cross-modal fine-grained semantic alignment. Moreover, to enhance the discriminative capability of one-to-many semantic distributions, we introduce a Hard Negative Mining mechanism that focuses on semantically similar but mismatched samples, helping to refine distribution boundaries in the probabilistic space and suppress interference from hard negative samples. Extensive experiments on three public datasets demonstrate that our method not only achieves superior overall performance but also exhibits excellent generalization ability. The code will be released on https://github.com/Yifei-AHU/OMRE

Abstract:
Face anonymization protects against facial data misuse but limits its application in scenarios requiring identity authentication. For example, in smart building access systems, users must prove their identity to gain entry while preserving facial privacy. Existing identifiable protection methods address this need but rely on plaintext-based authentication, which exposes users to privacy leakage and creates single points of trust. To overcome these limitations, we propose a novel facial privacy protection framework that supports multi-party secure authentication without privacy leakage. Specifically, we design a bidirectional mapping mechanism (BMM) with an identity mapping module (IMM) and a virtual identity (VID) extraction module (VIEM). IMM generates a VID based on a user command. The identity transfer model (ITM) generates a virtual face while retaining identity-independent attributes. VIEM extracts the identity from the virtual face and compares it with the VID in the encrypted domain. Moreover, the authentication process allows the user to update the command and revoke or rebind the VID as needed. The multi-party secure authentication process is implemented with homomorphic encryption. The user and the authentication server interactively perform the authentication process and output four partially decrypted ciphertexts. Other users can then verify the authentication results using these outputs. This multi-party authentication mitigates concerns about single points of failure and enables other users to verify and trust the authentication result. Extensive experiments demonstrate that our framework effectively balances privacy protection, attribute preservation, and authentication accuracy. The proposed method provides a practical solution for privacy-preserving identity authentication and offers promising potential for intelligent systems.

Abstract:
Recently, private and robust federated learning (FL) schemes have been proposed to address privacy inference and Byzantine attacks simultaneously. However, the private and robust aggregation protocols of existing schemes are inefficient because they employ heavy cryptographic techniques. To address this problem, we propose Sanitizer, an efficient, private, and robust FL framework. Specifically, we first design a Byzantine-robust defense for communication-efficient sign-based FL. We further propose a customized private and robust aggregation scheme built on our Byzantine-robust defense for FL. The core of our construction is two new efficient protocols, i.e., high-dimensional boolean summation and weighted boolean majority vote, which serve as the main building blocks of Sanitizer. Extensive evaluations on real-world datasets demonstrate that Sanitizer is blazing fast, achieving 19–23× less runtime compared to the state-of-the-art. Meanwhile, Sanitizer achieves the same accuracy as the plaintext baseline and superior Byzantine robustness against various classic attacks.
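
To make the aggregation rule concrete, the sketch below shows a weighted majority vote over sign vectors in plaintext. It illustrates only the functionality that sign-based FL aggregation computes, not Sanitizer's private protocols: there is no secret sharing or encryption here, the client weights are assumed to be given by some robustness check, and ties are broken toward +1 by convention. The name `weighted_majority_vote` is hypothetical.

```python
def weighted_majority_vote(sign_vectors, weights):
    # sign_vectors: one list of +1/-1 entries per client, all of equal length.
    # weights: one non-negative weight per client (assumed to be given).
    dim = len(sign_vectors[0])
    result = []
    for j in range(dim):
        # Weighted tally of the clients' signs for coordinate j.
        tally = sum(w * s[j] for s, w in zip(sign_vectors, weights))
        # Ties are broken toward +1 (an arbitrary convention of this sketch).
        result.append(1 if tally >= 0 else -1)
    return result


clients = [
    [1, -1, 1],
    [1, 1, -1],
    [-1, -1, -1],
]
print(weighted_majority_vote(clients, [1, 1, 1]))  # [1, -1, -1]
print(weighted_majority_vote(clients, [1, 3, 1]))  # [1, 1, -1]
```

Because each client contributes a single bit per coordinate, the weight assigned to a client bounds how much it can sway any coordinate, which is what makes sign-based aggregation attractive as a starting point for Byzantine robustness.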

Abstract:
Federated learning (FL) shows great promise in large-scale machine learning but introduces new privacy and security challenges. We propose ByITFL and LoByITFL, two novel FL schemes that enhance resilience against Byzantine users while preventing eavesdroppers from learning users’ private data. To ensure privacy and Byzantine resilience, our schemes assume that a small representative dataset is available to the federator and use a carefully crafted discriminator function to mitigate corrupt users’ contributions. ByITFL employs Lagrange coded computing and re-randomization, making it the first Byzantine-resilient FL scheme with perfect Information-Theoretic (IT) privacy, though at the cost of significant communication overhead. LoByITFL, on the other hand, achieves Byzantine resilience and IT privacy at a significantly reduced communication cost, but requires a Trusted Third Party (TTP), used only before training in a one-time initialization phase. We provide theoretical guarantees of privacy and Byzantine resilience, along with convergence guarantees and experimental results validating our findings.

Abstract:
Knowledge graph embedding (KGE) aims to learn high-dimensional representations of graph structures for various downstream tasks. However, knowledge graphs (KGs) are vulnerable to adversarial attacks. The adversary can delete key triplets or inject malicious triplets, thereby compromising the quality of learned representations. Existing target-oriented attack methods rely on the predefined target set, but the adversary can hardly obtain this information in realistic scenarios. In contrast, existing rule-oriented attacks only consider short logic rules, and their reliance on random walk sampling often yields many low-quality rules, degrading attack effectiveness. Surprisingly, we find that reasoning paths can represent multi-hop relations among entities in KGs and provide intuitive explainability. Therefore, we introduce PathAttack, the first path-based explainable untargeted attack framework for KGs. Concretely, a triplet discovery approach is proposed by combining selection using different types of KGE models to replace the predefined target set, which overcomes the limitation of unrealistic assumptions. In addition, we extract and weight the multi-hop relational paths between entities for adversarial deletion and addition, overcoming the bottlenecks of rule length and quality. Extensive experiments on FB15k-237, WN18RR, and the massive OGBL-WikiKG2 benchmark demonstrate the superior effectiveness of our method. Notably, PathAttack is scalable to large-scale KGs containing millions of entities, and improves adversarial attack effectiveness by 5% on WN18RR.

Abstract:
This paper introduces a robust optimization framework for cybersecurity decision-making for turn-based security games over probabilistic attack graphs. We address uncertainties in both the attacker’s state and the effectiveness of controls, proposing a novel approach based on repeated leader-multi-follower games; we introduce a game solution for these games as a minimization of the geometric mean across all possible worlds. We show fundamental mathematical properties of this game solution: (a) it is Pareto optimal, (b) it is equivalent to a standard leader-follower game of the sequence of scenarios, and (c) it is robust. Our framework incorporates budget constraints and leverages game-theoretic and robust optimization techniques for efficient solutions. We validate our approach through experiments, where we show our solutions outperform classic robust optimization solutions like minmax regret. We also present a case study showcasing the meaningfulness of our approach in a network attack scenario.
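
The idea of minimizing a geometric mean across possible worlds can be illustrated on a toy decision problem. The sketch below is not the paper's game solution: it simply enumerates a few candidate control sets, assumes a hypothetical table of strictly positive attacker success values per scenario, and picks the budget-feasible option with the smallest geometric mean. All names and numbers are made up for illustration.

```python
import math


def geometric_mean(values):
    # Computed in log space; all values must be strictly positive.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def best_control_set(losses_by_control, costs, budget):
    # Keep only the control sets the defender can afford,
    # then minimize the geometric mean of losses across scenarios.
    feasible = {c: l for c, l in losses_by_control.items() if costs[c] <= budget}
    return min(feasible, key=lambda c: geometric_mean(feasible[c]))


# Hypothetical attacker success values in two possible worlds.
losses = {
    "patch": [0.2, 0.8],
    "firewall": [0.4, 0.5],
    "both": [0.1, 0.3],
}
costs = {"patch": 1, "firewall": 1, "both": 3}

print(best_control_set(losses, costs, budget=2))  # patch
print(best_control_set(losses, costs, budget=3))  # both
```

With a budget of 2, the geometric mean favors "patch" (0.40 versus about 0.447), whereas the arithmetic mean would favor "firewall" (0.45 versus 0.50), so the two objectives can lead to different decisions.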

Abstract:
While ridesharing provides substantial convenience, it also raises several security concerns, with location privacy being a primary issue. A common state-of-the-art solution is to add random noise to user locations to preserve privacy. However, this approach often degrades matching efficiency due to reduced location accuracy. In this paper, we study the real-time matching problem between ridesharing requests and drivers, aiming to maintain high matching efficiency despite obfuscated locations. We model the order dispatching process as an online bipartite matching problem, where drivers are offline and requests arrive sequentially following a known distribution. We construct benchmark linear programs (LPs) and propose an LP-based online matching algorithm with provable performance guarantees. To address privacy concerns, we further develop a privacy-aware LP-based method that mitigates the impact of Laplace noise. Experiments on real-world datasets demonstrate the effectiveness of our algorithms and support our theoretical findings.

Abstract:
The widespread deployment of deep learning models across various applications has raised significant concerns regarding data privacy. Membership inference attacks (MIAs), a major privacy threat, aim to determine whether a specific sample is used during model training, thereby posing significant risks to sensitive information. Most existing MIA methods rely on the model’s final state output, overlooking the process by which the model memorizes training samples. To better exploit model memorization for MIAs, we propose a novel attack method called machine unlearning-based membership inference attack (MU-MIA). The proposed method introduces machine unlearning to incrementally reduce the model’s memorization of specific samples, generating a forgetting trajectory for each sample. The forgetting trajectory is composed of temporal variations in different metrics of the sample during machine unlearning. To distinguish member from non-member samples, we design a BiLSTM-based binary classifier with attention, which captures discriminative temporal patterns within each forgetting trajectory. Moreover, the machine unlearning phase of our attack is conducted under a zero-shot setting, which eliminates the need for any real data during the unlearning process, thereby improving the practicality and generalizability of the attack. We evaluate the proposed MIA method across different datasets and model architectures, and the comparative experimental results show that our method outperforms existing baseline attack methods.

Abstract:
Threshold ECDSA schemes distribute the capability of issuing signatures to multiple parties. They have been used in practical MPC wallets holding cryptocurrencies. However, most prior protocols are not robust, wherein even one misbehaving or non-responsive party would mandate an abort. Robust schemes have been proposed (Wong et al., NDSS ’23, ’24), but they do not match state-of-the-art number of rounds which is only three (Doerner et al., S&P ’24). In this work, we propose robust threshold ECDSA schemes RompSig-Q and RompSig-L that each take three rounds (where the first two are broadcasts, whereas the non-robust scheme of Doerner et al. uses no broadcasts). Building on the works of Wong et al. and further optimized towards saving bandwidth, they respectively take each signer (1.0t+1.6) KiB and 3.0 KiB outbound broadcast communication, and thus exhibit bandwidth efficiency that is competitive in practical scenarios where broadcasts are natively handled. RompSig-Q preprocesses multiplications and features fast online signing; RompSig-L leverages threshold CL encryption for scalability and dynamic participation.

Abstract:
Federated Learning (FL) allows multiple clients to collaboratively train a model without sharing their private data. However, FL is vulnerable to Byzantine attacks, where adversaries manipulate client models to compromise the federated model, and privacy inference attacks, where adversaries exploit client models to infer private data. Existing defenses against both backdoor and privacy inference attacks introduce significant computational and communication overhead, creating a gap between theory and practice. To address this, we propose ABBR, a practical framework for Byzantine-robust and privacy-preserving FL. We are the first to utilize dimensionality reduction to speed up the private computation of complex filtering rules in privacy-preserving FL. Additionally, we analyze the accuracy loss of vector-wise filtering in low-dimensional space and introduce an adaptive tuning strategy to minimize the impact of malicious models that bypass filtering on the global model. We implement ABBR with state-of-the-art Byzantine-robust aggregation rules and evaluate it on public datasets, showing that it runs significantly faster, has minimal communication overhead, and maintains nearly the same Byzantine-resilience as the baselines.

Abstract:
Local Differential Privacy (LDP) enables privacy-preserving data analytics without requiring a trusted aggregator and has attracted significant attention from both academia and industry. For key–value data, PrivKV has been proposed to support frequency and mean estimation under LDP. In PrivKV, the user first samples a key uniformly at random and applies a randomization mechanism to perturb the corresponding value. However, since both Sample and Perturb steps are conducted locally, PrivKV is susceptible to output poisoning attacks, where malicious users bypass these steps and submit crafted data, making the aggregation result biased. To address this vulnerability, we propose VPrivKV, a verifiable LDP protocol designed to defend against output poisoning attacks. VPrivKV enables users and the aggregator to jointly perform the sampling step using a coin-flipping protocol, while the perturbation is enforced through an interactive and verifiable mechanism. Furthermore, we propose an enhanced version of VPrivKV that integrates zero-knowledge proofs to prevent the adversary from forging the discretized value to suppress non-target keys, thereby further enhancing robustness. We theoretically analyze the privacy and robustness of the proposed protocols and conduct numerical simulations to demonstrate their effectiveness in defending against output poisoning attacks.
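The local Sample and Perturb steps can be illustrated with a simplified Python sketch. It assumes values normalized to [-1, 1], discretizes a value to +1 or -1, and then applies binary randomized response with privacy budget `epsilon`. This is an assumed, reduced form of value perturbation for illustration only: it omits key-presence perturbation and is not the VPrivKV protocol. All function and parameter names are hypothetical.

```python
import math
import random


def sample_key(num_keys, rng):
    """Sample step: pick one key index uniformly at random."""
    return rng.randrange(num_keys)


def perturb_value(v, epsilon, rng):
    """Perturb step for a value v in [-1, 1].
    1) Discretize: output +1 with probability (1 + v) / 2, else -1.
    2) Randomized response: keep the bit with probability
       p = e^eps / (1 + e^eps), otherwise flip it."""
    bit = 1 if rng.random() < (1.0 + v) / 2.0 else -1
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p else -bit


def estimate_mean(reports, epsilon):
    """Aggregator side: unbiased mean estimate from +1/-1 reports,
    obtained by dividing the raw average by (2p - 1)."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return (sum(reports) / len(reports)) / (2.0 * p - 1.0)
```

Because both steps run on the user's device, a malicious user can skip `perturb_value` and always report +1, which shifts `estimate_mean` upward. This is the kind of output poisoning that motivates making the steps verifiable.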

Abstract:
Graph-level anomaly detection (GLAD) aims to identify graphs that significantly deviate from others in a graph dataset. Existing methods predominantly rely on standard Graph Neural Networks (GNNs) to learn graph representations, but they often overlook subgraph-level information, which provides essential structural and semantic cues for distinguishing normal and anomalous graphs. This limitation not only compromises the detection performance but also hinders the interpretability of GLAD predictions. To address these challenges, we propose NGLAD, a novel framework that introduces the concept of normality-relevant subgraphs that capture shared patterns across normal graphs. These subgraphs serve as key indicators to distinguish normal graphs from anomalies that often lack or deviate from such patterns. During model training, by explicitly modeling the shared subgraph patterns inherent in normal graphs through a Subgraph Extractor and a Normality Learner, NGLAD identifies the subgraphs most relevant to normality. Leveraging the One-class Information Bottleneck principle, these modules ensure that the extracted subgraphs retain only the most informative features of normality while filtering out irrelevant nodes and edges. During inference, NGLAD detects anomalies by evaluating inconsistencies in representations between the input graph and its extracted subgraph. Extensive evaluations on synthetic and real-world datasets demonstrate that NGLAD significantly outperforms state-of-the-art methods in detection performance while offering interpretable explanations.

Abstract:
Solana is a rapidly evolving blockchain platform that has attracted an increasing number of users. However, this growth has also drawn the attention of malicious actors, with some phishers extending their reach into the Solana ecosystem. Unlike platforms such as Ethereum, Solana has distinct designs of accounts and transactions, leading to the emergence of new types of phishing transactions that we term SolPhish. We define three types of SolPhish and develop a detection tool called SolPhishHunter. Utilizing SolPhishHunter, we detect a total of 8,058 instances of SolPhish and conduct an empirical analysis of these detected cases. Our analysis explores the distribution and impact of SolPhish, the characteristics of the phishers, and the relationships among phishing gangs. Particularly, the detected SolPhish transactions have resulted in nearly $1.1 million in losses for victims. We report our detection results to the community and construct SolPhishDataset, the first Solana phishing-related dataset in academia.

Abstract:
Network traffic anomaly detection is critical for cybersecurity but faces challenges in accurately identifying malicious activities. Recent zero-positive approaches, which use only normal training data under the reconstruction paradigm, have shown progress. However, encrypted network traffic obscures normal–anomalous distinctions, causing confused modeling. In addition, the “identical shortcut” problem, where models reconstruct any input with similar fidelity, produces suboptimal representations and indistinguishable detection. To address these limitations, this paper introduces ConMD, a novel Contextual Masking Knowledge Distillation framework. ConMD features a distillation paradigm for discriminative representations and then pursues two objectives: effective contextual information modeling and a comprehensive anomaly metric. Specifically, we introduce context-aware local-global attention mechanisms for the student network’s backbone, which capture both intra-packet and inter-packet dependencies. Additionally, a context-enhanced masking training strategy is designed to facilitate contextual interactions in normal flows. Given the structural characteristics of network traffic, we also present a new anomaly scoring method with multi-view awareness, which perceives comprehensive traffic patterns. ConMD combines insights from both packet- and flow-level views to highlight deviations in anomalous network flows, thereby improving detection accuracy. Extensive experiments on three real-world datasets validate the effectiveness of ConMD, yielding consistent improvements over state-of-the-art baselines, achieving up to 2.8% and 5.1% AUC gains on the DataCon2020 and CIC-IDS2017 datasets, respectively. Our model code is available at https://github.com/ikun0124/ConMD

Abstract:
This paper proposes an integrated anti-jamming covert communication and sensing system assisted by reconfigurable intelligent surfaces (RIS). By jointly optimizing beamforming vectors and RIS phase shifts, the system maximizes the sum transmission rate while enhancing communication security, sensing accuracy, and anti-jamming capability. We present two comprehensive optimization schemes: a perfect scheme under ideal channel conditions and a robust scheme for practical scenarios. The perfect scheme jointly optimizes beamforming and phase shifts when perfect channel state information (CSI) is available, establishing a performance upper bound. The robust scheme addresses practical transmission challenges by transforming stochastic uncertainties from imperfect CSI and phase shift errors into deterministic constraints through statistical expectation analysis and worst-case formulations, ensuring reliable system performance under realistic conditions. Both schemes effectively solve the resulting non-convex problems through innovative mathematical reformulations using fractional programming, quadratic transformation techniques, and the alternating direction method of multipliers. Comprehensive simulation results demonstrate significant advantages of our proposed framework in communication reliability, sensing accuracy, and resilience against the jammer compared to conventional approaches.

Abstract:
Trusted Execution Environment (TEE) is a primary means for confidential computing. However, at the moment the RISC-V platform is limited for confidential computing because current RISC-V TEEs lack either scalability or compatibility. The reason for this dilemma in scalability and compatibility is that the standard isolation primitive on RISC-V, Physical Memory Protection (PMP), is not scalable. Meanwhile, previous enclave designs depend on the Rich Execution Environment (REE) for OS functionalities, which increases domain switch frequency and enlarges the attack surface of the TEE. In this work, we propose Coffer, a scalable and efficient software-based TEE for the standard RISC-V platform. Coffer includes two core techniques, Logical PMP (LPMP) and Enclave Modules (EModules), to address the issues mentioned above. LPMP is a secure and efficient framework for PMP virtualization. It provides both scalability and hardware compatibility for Coffer. EModules are dynamically assembled lightweight libraries to provide enclaves with OS functionalities. The EModules provide Coffer with software compatibility and reduce the Trusted Computing Base (TCB) size of the enclaves. We implement and evaluate Coffer on commercially available RISC-V devices. The evaluation results show that Coffer can support 2,000+ concurrent enclaves with negligible performance overhead. Particularly, LPMP supports enclave execution under heavy memory fragmentation with little performance overhead.

Abstract:
Ambient backscatter communication (AmBC) has become an integral part of ubiquitous Internet of Things (IoT) applications due to its energy-harvesting capabilities and ultra-low-power consumption. However, the open wireless environment exposes AmBC systems to various attacks, and existing authentication methods cannot be implemented between resource-constrained backscatter devices (BDs) due to their high computational demands. To this end, this paper proposes PLCRA-BD, a novel physical layer challenge-response authentication scheme between BDs in AmBC that overcomes BDs’ limitations, supports high mobility, and performs robustly against impersonation and wireless attacks. It constructs embedded keys as physical layer fingerprints for lightweight identification and designs a joint transceiver that integrates BDs’ backscatter waveform with receiver functionality to mitigate interference from ambient RF signals by exploiting repeated patterns in orthogonal frequency division multiplexing (OFDM) symbols. Based on this, a challenge-response authentication procedure is introduced to enable low-complexity fingerprint exchange between two paired BDs leveraging channel coherence, while securing the exchange process using a random number and unpredictable channel fading. Additionally, we optimize the authentication procedure for high-mobility scenarios, completing exchanges within the channel coherence time to minimize the impact of dynamic channel fluctuations. Security analysis confirms its resistance against impersonation, eavesdropping, replay, and counterfeiting attacks. Extensive simulations validate its effectiveness in resource-constrained BDs, demonstrating high authentication accuracy across diverse channel conditions, robustness against multiple wireless attacks, and superior efficiency compared to traditional authentication schemes.

Abstract:
Federated domain generalization (FedDG) aims to improve the global model’s generalization ability in unseen domains by addressing data heterogeneity under privacy-preserving constraints. A common strategy in existing FedDG studies involves sharing domain-specific knowledge among clients, such as spectrum information, class prototypes, and data styles. However, this knowledge is extracted directly from local client samples, and sharing such sensitive information poses a potential risk of data leakage, which might not fully meet the FedDG requirements. In this paper, we introduce prompt learning to adapt pre-trained vision-language models (VLMs) in the FedDG scenario, and leverage locally learned prompts as a more secure bridge to facilitate knowledge transfer among clients. Specifically, we propose a novel FedDG framework through Prompt Learning and AggregatioN (PLAN), which comprises two training stages to collaboratively generate local prompts and global prompts at each federated round. First, each client performs both text and visual prompt learning using their own data, with local prompts indirectly synchronized by regarding the global prompts as a common reference. Second, all domain-specific local prompts are exchanged among clients and selectively aggregated into global prompts using lightweight attention-based aggregators. The global prompts are finally applied to adapt the VLMs to unseen target domains. As our PLAN framework requires training only a limited number of prompts and lightweight aggregators, it offers notable advantages in terms of computational and communication efficiency for FedDG. Extensive experiments demonstrate the superior generalization ability of PLAN across four benchmark datasets. We have released our code at https://github.com/GongShuai8210/PLAN

Abstract:
Large language models (LLMs) have demonstrated significant utility in a wide range of applications; however, their deployment is plagued by security vulnerabilities, notably jailbreak attacks. These attacks manipulate LLMs to generate harmful or unethical content by crafting adversarial prompts. While much of the current research on jailbreak attacks has focused on single-turn interactions, it has largely overlooked the impact of historical dialogues on model behavior. Although recent studies have explored multi-turn jailbreak attacks, they generally assume that the attacker can only manipulate the user prompt. In contrast, we highlight that an attacker can also control the model’s previous outputs. To this end, we introduce DIA, a new paradigm that leverages fabricated dialogue history to enhance jailbreak effectiveness. DIA operates in a black-box setting, requiring only access to the chat API or knowledge of the LLM’s chat template. We propose two methods for constructing adversarial historical dialogues: one adapts gray-box prefilling attacks, and the other exploits deferred responses. Our experiments demonstrate that DIA achieves state-of-the-art attack success rates on recent LLMs, including Llama-3.1 and GPT-4o. Additionally, we show that DIA can bypass 6 different defense mechanisms, highlighting its robustness. Code is available at https://github.com/meng-wenlong/DIA

Abstract:
Website fingerprinting (WF) attacks are employed to identify websites that utilize Tor encryption. Although state-of-the-art (SOTA) WF attacks demonstrate strong performance in single-tab scenarios, they face challenges in multi-tab scenarios. Many multi-tab WF attacks rely solely on the direction sequence or process the directional and temporal sequences separately. They ignore the coupling between directional and temporal features, which reflects distinct resource-loading processes for different websites. To address the limitations of existing approaches, this paper proposes a new multi-tab WF attack, STMWF. It leverages spatial-temporal sequence analysis and jointly models inter-arrival time (IAT) with the direction sequence. STMWF utilizes an SE-attention-based feature extractor to derive features from various website resources within the spatial-temporal sequence. It then employs correlation self-attention to integrate these resource features into their respective websites, ultimately constructing distinct fingerprints for each site. Additionally, the method incorporates correlation denoising to suppress noise in the website fingerprints, thereby enhancing the discriminability of the extracted features. We collected single-tab traces to synthesize a dataset with controlled overlap ratios. We also captured real-world multi-tab traffic with varying tab-opening intervals, evaluating performance under authentic conditions. The experimental results indicate that STMWF significantly outperforms the SOTA multi-tab attacks in both dynamic and static settings. Specifically, it achieves an average F1-score improvement of approximately 14.87% under static conditions and 34.81% under dynamic conditions compared to the SOTA multi-tab WF attack, ARES. Furthermore, STMWF exhibits greater robustness against WF defenses than SOTA attacks and consistently surpasses them across varying overlapping scenarios.

Abstract:
Deep neural networks (DNNs) are vulnerable to backdoor attacks, where an attacker manipulates a small portion of the training data to implant hidden backdoors into the model. The compromised model behaves normally on clean samples but misclassifies backdoored samples into the attacker-specified target class, posing a significant threat to real-world DNN applications. Currently, several empirical defense methods have been proposed to mitigate backdoor attacks, but they are often bypassed by more advanced backdoor techniques. In contrast, certified defenses based on randomized smoothing have shown promise by adding random noise to training and testing samples to counteract backdoor attacks. In this paper, we reveal that existing randomized smoothing defenses implicitly assume that all samples are equidistant from the decision boundary. However, it may not hold in practice, leading to suboptimal certification performance. To address this issue, we propose a certified backdoor defense method with sample-specific smoothing noises, termed Cert-SSBD. Cert-SSBD first employs stochastic gradient ascent to optimize the noise magnitude for each sample, ensuring a sample-specific noise level that is then applied to multiple poisoned training sets to retrain several smoothed models. After that, Cert-SSBD aggregates the predictions of multiple smoothed models to generate the final robust prediction. In particular, in this case, existing certification methods become inapplicable since the optimized noise varies across different samples. To conquer this challenge, we introduce a storage-update-based certification method, which dynamically adjusts each sample’s certification region to improve certification performance. We conduct extensive experiments on multiple benchmark datasets, demonstrating the effectiveness of our proposed method. Our code is available at https://github.com/NcepuQiaoTing/Cert-SSBD

Abstract:
Blockchain consensus can be divided into synchronous consensus and asynchronous consensus according to the network status. In a real network environment, the network status of each node is constantly fluctuating. Hybrid consensus schemes adapt to network fluctuations by switching the consensus protocol between asynchronous and synchronous modes. However, existing schemes switch at the system level, resulting in low efficiency. In this paper, we present Realhybrid, a hybrid consensus scheme with node-level switching, which enables every node to select an appropriate consensus protocol based on its network status. We design corresponding protocols for each node to achieve efficient consensus under network fluctuations. Moreover, we establish a Realhybrid network model and quantify the relationship between its performance and system parameters. We conduct experiments on Realhybrid, and the results show that it has 29% lower transaction waiting volume and 17% lower transaction confirmation latency compared to other hybrid consensus schemes.

Abstract:
Federated Learning (FL) allows collaborative model training across distributed clients without sharing raw data, thus preserving privacy. However, the system remains vulnerable to privacy leakage from gradient updates and Byzantine attacks from malicious clients. Existing solutions face a critical trade-off among privacy preservation, Byzantine robustness, and computational efficiency. We propose a novel scheme that effectively balances these competing objectives by integrating homomorphic encryption with dimension compression based on the Johnson-Lindenstrauss transformation. Our approach employs a dual-server architecture that enables secure Byzantine defense in the ciphertext domain while dramatically reducing computational overhead through gradient compression. The dimension compression technique preserves the geometric relationships necessary for Byzantine defense while reducing computational complexity from O(dn) to O(kn) cryptographic operations, where k ≪ d. Extensive experiments across diverse datasets demonstrate that our approach maintains model accuracy comparable to non-private FL while effectively defending against Byzantine clients comprising up to 40% of the network. Our approach also demonstrates substantial improvements in computational and communication efficiency. Experimental evaluation shows that the dimension compression technique achieves a 25× to 35× reduction in computational overhead and a 17× reduction in communication overhead compared to our non-compression version. When compared to state-of-the-art methods like ShieldFL, our approach demonstrates order-of-magnitude improvements in both computational and communication efficiency while maintaining equivalent privacy guarantees and achieving superior Byzantine robustness comparable to FLTrust. These substantial efficiency enhancements make secure FL practical for deployment in large-scale neural networks with millions of parameters.
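The idea that a Johnson-Lindenstrauss projection shrinks vectors from d to k dimensions while roughly preserving pairwise distances can be sketched in plain Python. This is a generic Gaussian random projection written for illustration, not the scheme's actual construction. The function names and the dimensions used in the usage note are assumptions, and no encryption is involved here.

```python
import math
import random


def jl_matrix(k, d, rng):
    """Random k-by-d projection matrix with i.i.d. N(0, 1/k) entries."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, vec):
    """Compress a d-dimensional vector to k dimensions."""
    return [sum(r * x for r, x in zip(row, vec)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For example, with d = 1000 and k = 300, the ratio `distance(project(R, g1), project(R, g2)) / distance(g1, g2)` for two random gradients typically stays close to 1. Any distance-based filtering rule therefore sees nearly the same geometry, while each per-coordinate cryptographic operation runs over k values instead of d.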

Abstract:
Private Set Intersection (PSI) allows two mutually untrusted parties to compute the intersection of their private sets without revealing additional information. In general, PSI operates in a static setting, where the computation is performed only once on the input sets of both parties. Badrinarayanan et al. initiated the study of Updatable PSI (UPSI), which extends this capability to dynamically updating sets, enabling both parties to securely compute the intersection as their sets are modified while incurring significantly less overhead than re-executing a conventional PSI. However, existing UPSI protocols either do not support arbitrary deletion of elements or incur high computational and communication overhead. This work combines asymmetric PSI with Private Set Union (PSU) to present a novel UPSI protocol, which supports arbitrary additions and deletions of elements, offering a flexible approach to update sets. Furthermore, we design a primitive called multi-round OPRF to satisfy forward security (IEEE TIFS 2024). Our protocol enjoys efficient performance compared to previous work. Specifically, we implement our protocol and compare it against state-of-the-art conventional PSI and UPSI protocols. Experimental results demonstrate that our UPSI protocol achieves up to three orders of magnitude reduction in computational overhead and incurs 190× to 707× less communication overhead than the state-of-the-art UPSI protocol (ASIACRYPT 2024) that supports arbitrary additions and deletions.

Abstract:
The task of unsupervised visible–infrared person re-identification (USL-VI-ReID) aims to retrieve cross-modal pedestrian images without manual annotations. The key challenge lies in achieving semantic alignment to resolve modality bias in the absence of real labels. However, existing methods overly rely on single-modal information in the process of pseudo-label generation without considering cross-modal associations, making it difficult to bridge the modality gap between visible and infrared images. To address these issues, this paper proposes a Bi-level Inter-Modal Modulation Network (BIMM-Net), which employs multi-level cluster structure optimization as a core strategy to drive the establishment of cross-modal semantic associations, ultimately achieving cross-modal alignment at the feature representation level. Specifically, we construct a novel intermediary modality, GrayMix, from visible images to enhance model robustness against color variations and alleviate modality gaps. To establish a shared semantic space between visible and infrared modalities, we further develop a Ternary Pairs Calibration-Convergence module that filters noise from visible-infrared cluster matching and, on this basis, constructs fused mixture clusters. Building on this mixture cluster space, a Heterogeneous-Isomorphic Alignment Loss is also designed to align the feature distributions of the three modalities, reinforcing cross-modal semantic consistency. In addition, we present a Cross-modal Neighborhood Consistency Clustering method, which facilitates the formation and propagation of cross-modal clusters by selecting high-confidence cross-modal neighbor pairs and refining feature distances. Ultimately, through the joint modeling of bi-level clustering, BIMM-Net enables multiple levels to guide each other in refining cross-modal structures, thereby effectively establishing the semantic associations between visible and infrared modalities. Extensive experiments validate the superior performance of the proposed framework, achieving state-of-the-art results in USL-VI-ReID. The source code of this paper is available at: https://github.com/liujuny5920/DIMM-Net

Abstract:
The integration of advanced communication technologies with modern vehicular systems has driven the evolution of vehicular ad-hoc networks (VANETs). These networks enable a seamless exchange of road safety information between vehicles and traffic management authorities via wireless links. However, the open nature of these communication channels introduces significant risks to the privacy and security of transmitted messages. To address these challenges, Li et al. (IEEE Trans. Inf. Forensics and Security, vol. 19, pp. 9629–9642, 2024) proposed a lattice-based authentication scheme designed for fog-assisted VANETs. This protocol utilizes lattice cryptography to ensure resilience against quantum attacks and employs fog computing to tackle scalability issues. Despite these advancements, a detailed analysis uncovers several vulnerabilities and inefficiencies in their design. This study identifies an anonymity disclosure attack on Li et al.’s scheme, compromising its privacy guarantees. In addition, redundancies in the signature generation process impose excessive computational burdens on resource-constrained vehicular devices. To address these shortcomings, this work introduces a lattice-based anonymous batch-verifiable authentication (LBABVA) scheme. Rigorous security analysis proves the scheme’s security in the random oracle model, while efficiency evaluations reveal significant improvements. The proposed scheme reduces the computational cost of the signing phase to 14.57% and the signature verification phase to 83.99% of the corresponding costs of the previous design, highlighting its superior performance and suitability for practical applications.

Abstract:
Event-based state estimation for linear Gaussian systems has garnered significant attention in recent years, with deterministic and stochastic event-based protocols being the most representative. Previous studies have demonstrated that deterministic event-based protocols outperform stochastic ones in the trade-off between communication rate and estimation performance. However, our research reveals that the performance loss in deterministic event-based protocols can surpass that of stochastic event-based protocols for remote state estimation under specific attack scenarios. We explore the impact of insider attacks and derive a closed-form expression for the estimation error covariance under both protocols. Then, we propose a method for designing the attack threshold to meet stealthiness constraints. For scalar cases, we prove that under the same communication rates, the estimation performance of the stochastic event-based estimator outperforms that of its deterministic counterpart under insider attacks. Numerical simulations corroborate that the empirical results align with the theoretical findings.
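The two protocol families can be contrasted with a small scalar sketch in Python. The abstract does not give the trigger rules, so the forms below are assumptions taken from common usage: a deterministic rule that transmits when |z| exceeds a threshold `delta`, and a stochastic rule that transmits with probability 1 - exp(-Y z^2 / 2). The closed-form communication rates assume z is zero-mean Gaussian with standard deviation `sigma`. No attack is modeled, and all names are hypothetical.

```python
import math
import random


def deterministic_send(z, delta):
    """Deterministic trigger: transmit only when |z| exceeds delta."""
    return abs(z) > delta


def stochastic_send(z, Y, rng):
    """Stochastic trigger: transmit when a uniform draw exceeds
    exp(-Y * z^2 / 2), i.e. with probability 1 - exp(-Y * z^2 / 2)."""
    return rng.random() > math.exp(-0.5 * Y * z * z)


def deterministic_rate(delta, sigma):
    """P(|z| > delta) for z ~ N(0, sigma^2)."""
    return math.erfc(delta / (sigma * math.sqrt(2.0)))


def stochastic_rate(Y, sigma):
    """1 - E[exp(-Y z^2 / 2)] = 1 - 1 / sqrt(1 + Y sigma^2) for z ~ N(0, sigma^2)."""
    return 1.0 - 1.0 / math.sqrt(1.0 + Y * sigma * sigma)
```

Choosing `delta` and `Y` so that the two rate functions agree gives the equal-communication-rate setting in which the abstract compares the estimators.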

Abstract:
The need for enhanced transaction privacy in decentralized finance (DeFi) is critical. However, existing coin mixing solutions often reveal telltale patterns on the blockchain, exposing users to heuristic analysis. This paper presents DeFiMix, an indistinguishable coin mixing scheme engineered to obscure transaction flows while guaranteeing fairness and security. DeFiMix achieves this through a dual-layer mechanism. First, an off-chain secret handshake protocol enables anonymous negotiation between senders and mixers, effectively breaking the link between transactions and participants. Second, on-chain transactions are structured using time-locks and concurrent signatures to resemble common DeFi activities such as staking and lending, rendering them indistinguishable from ordinary operations. Using security analysis and extensive simulations, we validate DeFiMix’s ability to prevent transaction linkage while remaining practically viable. The results underscore DeFiMix’s strong indistinguishability and fairness, alongside its minimal computational demands, establishing it as a compelling solution for privacy-focused transactions within the DeFi ecosystem.

Abstract:
Public key encryption with equality test (PKEET) has been widely adopted in applications such as private health record management, secure outsourced data processing, and email filtering, owing to its ability to test equality between ciphertexts encrypted under different public keys. However, existing PKEET schemes often fall short in specialized settings, such as case-insensitive matching. Moreover, their security guarantees remain inadequate. In particular, many existing schemes are vulnerable to offline message recovery attacks (OMRA), which present a significant security challenge to PKEET. Furthermore, the existing IND-CCA security model is incomplete, as it fails to model all potential attacks that an adversary could exploit to execute the OMRA. To address these challenges, we propose a new variant of PKEET, termed public key encryption with case-insensitive fuzzy equality test (PKE-CIFET). To analyze security, we propose a unified security model that more closely aligns with IND-CCA security than previous works. This model includes more comprehensive oracles and considers adversaries launching the OMRA through multi-hop testing. Based on this model, we provide a comprehensive and rigorous security proof. Furthermore, experimental results demonstrate that the proposed PKE-CIFET scheme is efficient in terms of computational cost.

Abstract:
Temporal Forgery Localization (TFL) aims to identify the precise temporal boundaries of manipulated segments within videos. This represents a critical advancement beyond binary video-level forgery detection, because the latter is often insufficient for combating sophisticated partial forgeries that insert synthetic content into otherwise authentic media. The generation pipeline of such forgeries introduces two measurable artifacts: (1) audio-visual asynchrony resulting from imperfect lip-speech synchronization, and (2) abrupt transitions occurring at splice points. Current TFL approaches rely on architectures adapted from semantic tasks that implicitly learn forgery cues, limiting their precision in boundary detection. To address this issue, we propose a novel framework that explicitly quantifies audio-visual asynchrony as a direct signal for localization. Our approach utilizes a Coupled Pyramidal Encoder to extract multi-scale synchronized representations across modalities. These features feed into a Multi-Scale Asynchrony Probe that measures the temporal warping cost required for audio-visual alignment, translating desynchronization into a quantifiable forgery indicator. This measured asynchrony then guides our Context-Aware Boundary Pinpointing module to selectively amplify manipulation-related discontinuities while suppressing benign scene changes. Experiments on LAV-DF and Deepfake1M benchmarks demonstrate that our artifact-centric design achieves state-of-the-art performance, improving high-precision localization (AP@0.95) by up to 27.5 points over previous methods. Furthermore, cross-dataset evaluations on LAV-DF and Deepfake1M demonstrate that our method significantly outperforms SOTA models in domain adaptation scenarios. These results validate that explicitly quantifying asynchrony provides a powerful guiding signal for precise temporal forgery localization. 
The implementation code of the paper is available at https://github.com/wangzhiyuan120/tfl_asynchrony_pub

Abstract:
Referring video object segmentation (RVOS) is an emerging task that aims to segment the text-referred objects in the given video sequence. This capability plays a critical role in some real-world safety-critical applications such as autonomous driving. However, advanced RVOS models predominantly leverage deep neural networks that are inherently vulnerable to adversarial perturbations, which raises serious safety concerns. Although some studies have explored adversarial attacks on video object segmentation (VOS), the robustness and security of RVOS models against such attacks remain insufficiently investigated. This work thus, for the first time, comprehensively investigates the adversarial robustness of RVOS models. Distinct from other VOS tasks, RVOS is more challenging due to its multi-modal nature and high dependence on spatial-temporal information. Considering that, we propose a cross-prompt Multimodal attack with Inter-Clip Momentum (xM-ICM) to effectively mislead RVOS models under both white-box and black-box scenarios. The proposed xM jointly corrupts visual and textual embeddings and integrates a cross-prompt strategy during iterative optimization to enhance generalization across diverse linguistic queries. The ICM module harnesses the spatial-temporal dependencies across sequence clips via two momentum banks to preserve the perturbation coherence throughout the whole video and stabilize the adversarial optimization. Experimental results on three benchmarks and five prevalent RVOS models demonstrate the superior white-box attack performance and strong black-box transferability of our proposed method.

Abstract:
In recent years, phishing scams have become one of the most rampant criminal activities on Ethereum, causing significant financial losses to investors and disruptions to the Ethereum ecosystem. Existing phishing scam detection methods typically model Ethereum transaction records as graphs, extracting features from paired nodes based on the topological relationships. However, these methods mostly focus on low-order relational aspects, neglecting higher-order structural information in the network. In this paper, we propose a new method — Ethereum Phishing Scam Detection by Higher-Order Topology (EPSD-HOT), which improves phishing scam detection performance by mining higher-order topological features from the network. We conduct experiments on a public dataset and a crawled real-world dataset, extracting ten subgraphs with distinct network characteristics. The experimental results show that the average AUC-ROC for the ten subgraphs is 0.9970, with improvements ranging from 0.0181 to 0.1658 compared to baseline methods. This indicates that our approach is highly robust and can effectively detect phishing scams across different subgraphs while overcoming the issue of class imbalance. By incorporating higher-order structural information into node features, this work offers new insights for enhancing phishing scam detection in Ethereum.

Affiliations: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; School of Computer Science, The University of Auckland, Auckland, New Zealand; School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China; School of Information and Communication Technology, Griffith University, Nathan, Australia; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; College of Computing and Data Science, Nanyang Technological University, Nanyang Ave, Singapore; Lingnan University, Hong Kong, China; Beijing Academy of Blockchain and Edge Computing, Beijing, China

Abstract:
The rapid growth of blockchain technology has driven the widespread adoption of smart contracts; however, their inherent vulnerabilities have led to significant financial losses. Traditional auditing methods, while essential, struggle to keep pace with the increasing complexity and scale of smart contracts. Large language models (LLMs) offer promising capabilities for automating vulnerability detection, but their adoption is often limited by high computational costs. Although prior work has explored leveraging large models through agents or workflows, relatively little attention has been given to improving the performance of smaller, fine-tuned models—a critical factor for achieving both efficiency and data privacy. In this paper, we introduce HKT-SmartAudit, a framework for developing lightweight models optimized for smart contract auditing. It features a multi-stage knowledge distillation pipeline that integrates classical distillation, external domain knowledge, and reward-guided learning to transfer high-quality insights from large teacher models. A single-task learning strategy is employed to train compact student models that maintain high accuracy and robustness while significantly reducing computational overhead. Experimental results show that our distilled models outperform both commercial tools and larger models in detecting complex vulnerabilities and logical flaws, offering a practical, secure, and scalable solution for smart contract auditing. The source code is available in the GitHub repository at https://github.com/LLMSmartAudit/FTSmartAudit

Abstract:
Honeywords are plausible-looking decoy passwords associated with each user’s real password to enable timely detection of password leakage. The more indistinguishable the honeywords are, the more secure a honeyword scheme is. However, honeyword schemes that externally generate honeywords can only approximate, but not equate, the distribution of user-chosen passwords, so they are unlikely to achieve the ideal indistinguishability. To address this issue, internally sampled honeyword schemes that sample honeywords from other users’ passwords have been proposed. In this work, we first reveal two critical security flaws and two critical usability flaws in existing internally sampled honeyword schemes, i.e., Honeyindex (TDSC’16) and Superword (COSE’21). We then formalize a generic framework for sound internally sampled honeyword schemes, and propose variants for both Honeyindex and Superword. To evaluate the security of our framework in a principled way, we propose Bayesian and intersection attack theories leading to attackers’ optimal distinguishing strategies, and evaluate them under three major attacker models, each with varied capabilities (e.g., using leaked datasets and users’ personal information). Evaluation results show that, when 40 sweetwords are associated with each user (as recommended at IEEE S&P’22), with only one guess per account, the basic attacker’s success rate can reach 3.82%~4.12%, and she can identify 4.31%~5.04% of all real passwords with 10^4 honeyword login attempts, breaking the ideal 2.50% (= 1/40) security. Two more advanced attackers can identify 18.6%~44.8% and 20.6%~43.6% of all real passwords in 10^4 honeyword login attempts, respectively. When multiple password files are available, the intersection attack alone identifies 18.3%~18.6% of real passwords. We also explore the impacts of denial-of-service attacks. In all, this work reveals the inherent insecurity of internally sampled honeyword schemes.
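
The ideal 2.50% figure in the abstract above is the success rate of an attacker who picks one of the 40 sweetwords uniformly at random. The short sketch below reproduces that baseline and expresses the reported one-guess success rates as multiples of it. The function names are hypothetical, and the ratio is only a convenience for reading the numbers, not a metric defined in the paper.

```python
def random_guess_baseline(num_sweetwords: int) -> float:
    """Success rate of a uniform random guess among the sweetwords."""
    return 1.0 / num_sweetwords


def ratio_to_baseline(success_rate: float, num_sweetwords: int) -> float:
    """How many times better than random guessing a reported success rate is."""
    return success_rate / random_guess_baseline(num_sweetwords)


baseline = random_guess_baseline(40)        # ideal security with 40 sweetwords
low = ratio_to_baseline(0.0382, 40)         # reported lower bound, one guess per account
high = ratio_to_baseline(0.0412, 40)        # reported upper bound, one guess per account
```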

Abstract:
The development of decentralized finance on blockchains has led to a growing number of high-frequency traders interacting with decentralized applications (DApps) to earn profits. However, DApps have a deficiency during high-frequency interactions: transaction records generated by these interactions are freely accessible on blockchains. If a user interacts with DApps at high frequency, the user’s transaction patterns can be exposed to attackers. Attackers can predict the conditions under which a user will execute transactions and the parameters the user will use. Due to the transparency of pending transactions, attackers can build honeypots to trick users into performing certain actions, resulting in significant security risks. This study presents the first formal definition of the honeypot in high-frequency interactions on DApps. We introduce matching rules and propose a log-based transaction parsing method to detect honeypots caused by the deficiency. Our method finds that 99 smart contracts are affected by 636 incidents, resulting in losses exceeding 25M USD. When a honeypot transaction occurs, the victim’s transaction is reverted, underscoring the importance of understanding the underlying causes. However, analyzing revert causes directly from transaction input data poses challenges, especially when the victim’s contract is not open-source. To address this challenge, we introduce a novel frequent mining method to analyze the causes of honeypots. The recovery success rate reaches 82.39%. Based on the causes that lead to transaction reverts, we propose potential strategies to mitigate these security risks and validate them in a simulated environment.

Abstract:
Homomorphic encryption-based federated learning (HEFL) strengthens privacy by aggregating encrypted model updates, but it also renders existing backdoor defenses that assume plaintext updates inapplicable. We present HEFLGuard, a single-server backdoor detection framework for HEFL in which the server constructs overlapping validation models from encrypted client groups and clients locally compare logits of the global and validation models on benign samples to expose backdoor behavior. HEFLGuard further combines consistency verification across non-IID validation groups with Byzantine fault-tolerant aggregation of client reports, ensuring robustness under heterogeneous data and Byzantine participants. We evaluate HEFLGuard on seven vision/text benchmarks under three backdoor types across IID and non-IID settings. HEFLGuard consistently reduces ASR from near 100% to nearly the no-backdoor level while keeping the drop in clean accuracy within 2.5%. Compared with prior work, HEFLGuard achieves higher robustness and deployability.

Abstract:
Rapid advancements in video diffusion models have enabled the creation of realistic videos, raising concerns about unauthorized use and driving the demand for techniques to protect model ownership. Existing watermarking methods suffer from two key limitations: they overlook temporal consistency because they rely on conventional watermark decoders, and they degrade the visual quality of the generated videos. To address these issues, we introduce a robust watermarking method for latent video diffusion models named Latent Video Diffusion Watermarking (LVMark). We propose a novel watermark decoder tailored for generated videos by learning the consistency between adjacent frames. It ensures accurate message decoding, even under malicious attacks, by combining the low-frequency components of the three-dimensional wavelet domain with the color features of the video. Additionally, we train a latent decoder to maintain the visual fidelity of the generated video. Watermarks are embedded into layers with minimal impact on visual appearance using an importance-based weight modulation strategy. We optimize both the watermark decoder and the latent decoder of the diffusion model, effectively balancing the trade-off between visual quality and bit accuracy. Our experiments show that our method embeds invisible watermarks into video diffusion models, ensuring robust decoding accuracy with 512-bit capacity, even under distortions.

Abstract:
The secure deployment of Federated Learning (FL) is critically undermined by statistical data heterogeneity and a profound vulnerability to adversarial attacks. These weaknesses are exacerbated by the fact that FL’s privacy-preserving design precludes the use of large-scale, centralized data for robust training. Existing prototype-based methods suffer from representation collapse when naively aggregating from non-IID clients, while generative approaches often lack a principled mechanism for synthesizing features that confer adversarial resilience. We introduce Federated Contrastive Diffusion Prototypes (Fed-CDP), a novel paradigm that transforms the server from a passive aggregator into an active synthesis hub for robust features. Fed-CDP aggregates lightweight client prototypes to serve as semantic anchors, guiding a server-side diffusion model via a contrastive objective. This process synthesizes a high-fidelity feature space explicitly optimized for maximal inter-class separability, a property intrinsically linked to robust generalization. These server-generated features are then distributed to clients as a potent regularizer, aligning disparate local models and directly mitigating client drift. Our extensive evaluations across multiple challenging datasets establish that Fed-CDP outperforms existing state-of-the-art baselines. For instance, on CIFAR-100 under severe heterogeneity ($\alpha = 0.1$), Fed-CDP surpasses leading methods by nearly 5% in standard accuracy and over 9% in robust accuracy under Projected Gradient Descent attacks. Fed-CDP provides a new blueprint for building secure and high-performance collaborative AI, laying the foundation for trustworthy systems in critical sectors like finance and multi-institutional healthcare.

Abstract:
Multi-modal data provides abundant and diverse object information, crucial for effective modal interactions in the Re-Identification (ReID) task. However, existing approaches often overlook the quality variations in local features and fail to fully leverage the complementary information across modalities, particularly in cases where features are of low quality. In this paper, we propose to address this issue by leveraging a novel graph reasoning model, termed the Modality-aware Graph Reasoning Network (MGRNet). Specifically, we first construct modality-aware graphs to enhance the extraction of fine-grained local details by effectively capturing and modeling the relationships between patches. Subsequently, the selective graph nodes swap operation is employed to alleviate the adverse effects of low-quality local features by considering both local and global information, enhancing the representation of discriminative information. Finally, the swapped modality-aware graphs are fed into the local-aware graph reasoning module, which propagates multi-modal information to yield a reliable feature representation. Another advantage of the proposed graph reasoning approach is its ability to reconstruct missing modal information by exploiting inherent structural relationships, thereby minimizing disparities between different modalities. Experimental results on four benchmarks (RGBNT201, Market1501-MM, RGBNT100, MSVR310) indicate that the proposed method achieves state-of-the-art performance in multi-modal object ReID. Our code is available at https://github.com/wanxixi11/MGRNet

Abstract:
The rapid advancement of AI-generated content (AIGC) presents significant challenges for digital forensics, necessitating robust and generalizable detection frameworks. Existing detection methods primarily rely on visual feature extraction, while vision-language model-based approaches are limited to class-label prompts, failing to capture quality-related artifacts introduced by different generative models. To address this limitation, we introduce QAFD, a novel Quality-Assisted Forgery Detection framework that incorporates image quality information into the detection process. Specifically, we design a quality queried attention block to effectively fuse class-based content prompts with quality-aware text prompts. This integration enhances the model’s ability to capture semantic artifacts related to degradation patterns commonly associated with AI-generated images. Furthermore, we introduce the Quality-Guided Forgery Adapter (QGFA) to incorporate quality-aware textual cues into the visual domain, improving feature extraction for both spatial and frequency-based forgery artifacts. This synergy allows frequency cues to enhance low-level artifact perception, while quality-aware guidance strengthens high-level discriminative representation. Extensive experiments demonstrate that QAFD achieves superior generalization to unseen generative models across three datasets and remains robust against common image post-processing operations. The code will be released at https://github.com/wangjun9276/QAFD_AIGC

Abstract:
In recent years, advancements in large language models have led to significant innovation and critical progress in AI. However, some of these innovations are raising privacy and security concerns. Machine unlearning has therefore emerged as a potential solution to mitigate such risks. Yet, while erasing data records from traditional models is relatively straightforward, making a large language model “forget” what it has learned is often very challenging. This is not only because such models include so many parameters, but also because the knowledge they possess is intricately entangled. Further, the privacy risk of unlearned data remains neglected in most unlearning solutions. To overcome these limitations, we took advantage of information retrieval and developed an efficient privacy-preserving unlearning mechanism. Our solution eliminates the impact of targeted information by removing high-risk semantic meanings from the model’s output. It also incorporates differentially-private randomization to make the unlearned information statistically indiscernible. Most importantly, the algorithm requires neither parametric fine-tuning nor in-context prompt calibration. A theoretical analysis demonstrates that this method satisfies rigorous privacy and unlearning guarantees. Additionally, experiments on real-world datasets show that the method is effective and can handle practical unlearning tasks for large language model applications.

Abstract:
Fully homomorphic encryption (FHE) supports computing over encrypted data without requiring decryption, promising the privacy and confidentiality of data. With the rapid development of the Internet of Things (IoT), the SEAL-Embedded library was developed to implement CKKS efficiently on resource-constrained embedded devices. Although FHE is secure against mathematical cryptanalysis, its implementations on embedded devices are vulnerable to physical attacks. In this work, we propose a novel side-channel attack on the SEAL-Embedded library. We analyze the vulnerabilities in the rejection sampling of the SEAL-Embedded library. The bit concatenation used for memory compression leads to multivariate leakage of key coefficients, and there is a correlation between the leakages of different coefficients. To enhance the accuracy of coefficient recovery, we design a deep learning model that adapts to the leakage features arising from the bit concatenation implementation for memory compression. Additionally, we construct the factor graph of the rejection sampling procedure to jointly exploit three sources of leakage: the sampled key coefficients, the random bytes before modulo reduction, and the Hamming weight of the concatenated bits, thereby improving the attack success rate on key coefficients. The proposed attacks are evaluated on the ARM Cortex-M4, which is widely used in IoT applications. The experimental results demonstrate that the proposed method achieves a 99% attack success rate in the single-trace attack, representing a 13.4% improvement over the widely adopted CNN-based attack.

Abstract:
The advent of deep learning has accelerated the development of facial manipulation techniques, particularly face-swapping and face attribute editing, raising serious concerns about privacy and identity-related misuse. Existing proactive defense methods predominantly target attribute editing and often generalize poorly to face-swapping models, making it difficult to provide effective protection across both tasks within a unified framework. To bridge this gap, we propose a generalized defense framework, Epoch-Adaptive Adversarial Perturbation Optimization (EA-APO). Specifically, EA-APO introduces a proactive defense mechanism that establishes optimal adversarial paths by optimizing perturbations on a white-box surrogate model to enhance adversarial transferability, and applies the resulting perturbations to source face images to disrupt both face swapping and face attribute editing, even against previously unseen target models in black-box settings. This approach mitigates identity feature tampering while adapting to changes in visual attributes and preserving high-quality adversarial examples. Experimental results show the generalization of our method across multiple face-swapping and attribute-editing models, including commercial ones, while also maintaining strong defense under various common post-processing operations and real-world social media transmission conditions, underscoring its potential for real-world deployment.

Affiliations: College of Cyber Security, the College of Information Science and Technology, and Guangdong Key Laboratory of Data Security and Privacy Preserving, Jinan University, Guangzhou, China; College of Cyber Security, Engineering Research Center of Trustworthy AI, Ministry of Education, Jinan University, Guangzhou, China; Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan; Department of Computer Science, Auckland University of Technology, Auckland, New Zealand

Abstract:
Visual cryptography scheme (VCS) and polynomial-based secret image sharing (PSIS) are two primary types of secret sharing for protecting images, each with its own pros and cons. VCS offers perfect security and easy decoding, but it suffers from lossy secret recovery and is limited to binary images. PSIS can handle grayscale/color images and offers lossless secret reconstruction; however, its secret decoding is computationally intensive (i.e., $\mathcal{O}(k \log^2 k)$ for a $(k, n)$ threshold), and the residual-image problem in PSIS compromises security. In this paper, we investigate a sharing technique that preserves the advantages of both VCS and PSIS. Differing from existing VCS and PSIS, the proposed sharing method is built on the result of access structure partition (ASP). Specifically, an ASP-guided image secret sharing approach is developed and three optimal ASP algorithms are designed. Compared with the existing partition method, our partition techniques offer significant improvement, especially for $(k, n)$ thresholds with larger $n$. For example, for the $(2, 15)$, $(2, 18)$, and $(4, 12)$ thresholds, the numbers of sub-access structures involved by our method are 4, 5, and 19, whereas those of the existing approach are 8, 10, and 45, corresponding to improvements of 100%, 100%, and 137%, respectively. Further, based on the partition result from the ASP algorithms, we can employ $(k, k)$ probabilistic VCS (PVCS) to constitute a $(k, n)$ sharing method for encoding gray-level/color images. Experiments confirm the effectiveness of the sharing method and the ASP algorithms. Meanwhile, comparisons show that our sharing method provides perfect security, low decoding complexity (i.e., $\mathcal{O}(d)$), lossless secret recovery (i.e., PSNR $= \infty$, SSIM $= 1$), and support for grayscale/color images.
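
The abstract above does not state how its improvement percentages are computed. The sketch below assumes the improvement is the reduction in the number of sub-access structures measured relative to the proposed method's count. That assumption reproduces the reported 100%, 100%, and 137% from the counts given in the abstract. The function and variable names are hypothetical.

```python
def improvement_percent(existing_count: int, proposed_count: int) -> float:
    """Relative reduction in sub-access structures, taken against the proposed count."""
    return (existing_count - proposed_count) / proposed_count * 100.0


# (k, n) threshold -> (existing approach, proposed method), as quoted in the abstract.
counts = {
    (2, 15): (8, 4),
    (2, 18): (10, 5),
    (4, 12): (45, 19),
}

improvements = {
    threshold: round(improvement_percent(existing, proposed))
    for threshold, (existing, proposed) in counts.items()
}
```

Under this reading, the (4, 12) case evaluates to (45 - 19) / 19, or about 136.8%, which rounds to the 137% quoted in the abstract.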

Abstract:
Machine learning-based methods for encrypted traffic classification can be effectively applied to analyze encrypted proxy traffic generated by proxy protocols, which are intermediary protocols used to route network traffic through a remote server. Nonetheless, different encrypted proxy protocols generate distinct traffic patterns, even when they handle the same network behavior. To address these distribution differences, a straightforward approach is to collect datasets specific to each proxy protocol. However, typical proxy protocols repackage original traffic by encrypting it without payload padding or compression. This leads to a definite characteristic correlation between original and encrypted proxy traffic. We propose an End-to-end Original traffic-based Encrypted Proxy Traffic Classification framework (EO-EPTC) to bridge the distribution gap between original traffic and proxied traffic, enabling the classification of encrypted proxy traffic using an original traffic dataset. EO-EPTC conducts sequence feature alignment to reduce distribution bias and employs a Seq2Seq model to capture the underlying semantics of the proxy protocol, creating a sequence feature transformation model. We apply EO-EPTC to existing encrypted traffic classification models, training them on original traffic to classify proxied traffic. This achieves up to 99.70% accuracy on encrypted proxy traffic, comparable to models trained directly on proxied traffic.

Abstract:
Recent studies have revealed that Deep Neural Networks (DNNs) are highly vulnerable to adversarial examples, which are generated by introducing imperceptible perturbations to clean images, leading to misclassification. Existing untargeted attacks usually focus only on weakening the original class when generating adversarial examples, ignoring the model’s prediction distribution over the other classes. Based on an analysis of the attention heatmaps of model decisions and of existing adversarial attack results, we find that the high-confidence negative classes of an image often reflect the naturally weak direction in the model decision, and that updating adversarial examples along this direction is more likely to help them deviate from the original class. Therefore, we propose an untargeted adversarial example generation method via Negative Class Guidance (NCG). First, the logits of the clean image are extracted according to the classification confidence. Second, the soft label is generated via smoothing and normalization operations. Finally, a novel loss function is derived that integrates negative class information with the soft label to guide the update direction of adversarial examples. Extensive experiments conducted on the ImageNet dataset demonstrate that NCG substantially enhances the adversarial transferability of state-of-the-art attack methodologies on both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), highlighting its effectiveness in black-box attack scenarios.

Abstract:
Blind quantum computation (BQC) enables clients with limited quantum capabilities to protect the privacy of inputs, outputs, and algorithms during the computation process. However, if a client’s private information is exposed to a server after the computation, the server can deduce the client’s output and even input. Quantum encryption with certified deletion (QECD) offers a potential solution by enabling the data owner to generate a deletion certificate, making the original plaintext inaccessible, provided the certificate is valid. Nevertheless, current QECD can only handle classical data and cannot be applied to BQC with quantum inputs. This paper first introduces the concept of certified deletion for quantum states and then proposes a single-client BQC protocol with certified deletion, where the client can use the classical certificate generated by the server to confirm whether her quantum inputs have been deleted after computation. We also give a specific example and simulate it using Qiskit to show its feasibility. In addition, the proposed protocol can be extended to a multi-client environment in which honest clients request certified deletion if any client disconnects or behaves maliciously, thereby terminating the protocol and protecting their privacy.

Abstract:
Dynamic hand gesture authentication (DHGA) has emerged as a promising biometric technology, offering enhanced theoretical security over conventional unimodal systems by combining both physiological and behavioral characteristics. Existing DHGA research predominantly focuses on controlled lab conditions and therefore generalizes poorly to uncontrolled application conditions. To bridge this gap, we propose a novel Skeleton-assistant Standardization and Authentication Framework (SSAF) that incorporates a generic data preprocessing method before authentication. First, we introduce a Geometry-Environment Standardization (GE-Stan) method to standardize five primary geometric and environmental factors inducing data distribution discrepancy, significantly improving robustness across different sessions and scenarios. Notably, the GE-Stan method can be applied to most existing algorithms and brings substantial improvement. Second, we design an Appearance and Motion Network (AM-Net) to fully leverage standardized video and skeleton data. It decouples appearance and motion features using specialized representation and processing strategies. As a result, SSAF achieves a flexible balance between accuracy and efficiency, enabling up to a 3.6× efficiency boost with only minor accuracy trade-offs. Finally, to support real-world evaluation, we also contribute a new challenging dataset, SCUT-RealDHGA, captured under uncontrolled practical conditions with diverse backgrounds and illuminations. Extensive experiments across three DHGA datasets demonstrate that SSAF outperforms existing methods in terms of accuracy, efficiency, and robustness. The code and dataset are available at https://github.com/SCUT-BIP-Lab/SSAF

Abstract:
The decentralized nature of federated learning (FL) makes it difficult to verify the trustworthiness of participating clients, creating an opportunity for backdoor attacks. This paper addresses a general backdoor-resilient decentralized FL problem without any prior knowledge of the type of backdoor attacks or information about malicious clients. After an in-depth investigation of how backdoor attacks are conducted in FL, we introduce a multi-armed bandit-based knowledge distillation approach to help benign clients learn useful knowledge from other clients while rejecting potential backdoors hidden in shared updates. Unlike most previous works that rely on identifying and removing malicious updates (an approach limited to scenarios with fewer than 50% attackers), our knowledge distillation technique enables benign clients to reject backdoored knowledge while preserving useful information, maintaining effective defense even when malicious clients exceed 50% of the population. Additionally, to handle the varied updates from clients with Non-IID datasets, a multi-armed bandit scheme is designed for each benign client to select the most appropriate teachers for knowledge distillation, resulting in high accuracy and fast convergence. Extensive experiments demonstrate that our multi-armed bandit-based knowledge distillation approach achieves high accuracy and general backdoor resilience. Comparisons with previous works show that our approach can reduce the attack success rate by 14.71% to 96.78% on average.
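As a rough illustration of the teacher-selection idea, the sketch below uses UCB1, a standard multi-armed bandit rule, to choose among candidate teachers. The abstract does not say which bandit algorithm or reward signal the authors use, so UCB1, the `TeacherBandit` class, and the fixed reward values (standing in for something like a teacher's usefulness on a benign client's local validation data) are all assumptions made for this example.

```python
import math


class TeacherBandit:
    """UCB1 bandit over candidate teachers (hypothetical helper, not the paper's scheme)."""

    def __init__(self, num_teachers):
        self.counts = [0] * num_teachers   # times each teacher was selected
        self.means = [0.0] * num_teachers  # running mean reward per teacher

    def select(self):
        # Try every teacher once before applying the UCB rule.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        # Mean reward plus an exploration bonus that shrinks with more pulls.
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen teacher.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_demo(rounds=500):
    # Assumed, fixed rewards: teacher 1 is the most useful one.
    assumed_reward = [0.2, 0.9, 0.5]
    bandit = TeacherBandit(len(assumed_reward))
    for _ in range(rounds):
        arm = bandit.select()
        bandit.update(arm, assumed_reward[arm])
    return bandit.counts


counts = run_demo()
```

In this toy setting the selection counts concentrate on the teacher with the highest assumed reward, while the exploration bonus still gives the other teachers occasional trials.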

Abstract:
The primary objective of jamming strategy optimization is to ensure that a jammer finds an effective jamming strategy against the multifunction radar (MFR) in time, thereby ensuring the safety of targets. Deep reinforcement learning (DRL) has been widely applied to the problem of jamming strategy optimization. However, the process still faces challenges such as low learning efficiency and a heavy memory burden. Therefore, we propose a fast jamming strategy optimization method with imperfect experience. Firstly, we model the radar countermeasure process as a Markov decision process (MDP), and formulate the jamming reward function by combining the jamming effectiveness and the jammer’s operational intent. Secondly, we design a novel hybrid jamming strategy choice module, which uses imperfect experience to improve the optimization efficiency of the jamming strategy. Furthermore, to improve sample efficiency and reduce forgetting caused by a small replay buffer, we employ a mixed replay buffer strategy and a knowledge consolidation technique, respectively. Finally, extensive experiments demonstrate that under the guidance of imperfect experience, our proposed method achieves faster convergence and higher strategy accuracy than existing DRL-based methods.

Abstract:
iPhone portrait-mode images contain a distinctive pattern in out-of-focus regions simulating the bokeh effect, which we term Apple’s Synthetic Defocus Noise Pattern (SDNP). If overlooked, this pattern can interfere with blind forensic analyses, especially PRNU-based camera source verification, as noted in earlier works. Since Apple’s SDNP remains underexplored, we provide a detailed characterization, proposing a method for its precise estimation, modeling its dependence on scene brightness, ISO settings, and other factors. Leveraging this characterization, we explore forensic applications of the SDNP, including traceability of portrait-mode images across iPhone models and iOS versions in open-set scenarios, assessing its robustness under post-processing. Furthermore, we show that masking SDNP-affected regions in PRNU-based camera source verification significantly reduces false positives, overcoming a critical limitation in camera attribution, and improving state-of-the-art techniques.

Abstract:
Face forgery detection suffers from cross-dataset generalization challenges, where performance degradation occurs due to distribution shifts between training and testing data. Recently, pseudo-fake face generation strategies have mitigated model overfitting to specific forgery traces. However, detectors based on this strategy exhibit an overreliance on blending boundary artifacts for their classification decisions. This overreliance significantly limits their ability to generalize to more advanced face manipulation algorithms, such as FaceDancer and InSwap, which are designed to produce smooth and natural transitions in the blending boundary region. To address this, we propose MAP-Mamba, a novel Multi-Artifacts Perception Mamba framework for modeling generalizable artifact representations from “Generation” to “Enrichment” to “Strengthening”. First, we design an attribute-level face blending method that generates pseudo-fake faces containing fine-grained artifacts via three attribute generators. These pseudo-fakes mimic subtle local inconsistencies in advanced forgery algorithms, guiding MAP-Mamba to learn diverse forgery features beyond the blending boundary artifacts. Second, considering the variability of the face artifact distribution caused by different forgery algorithms, an artifact style mixing strategy is designed to enrich the artifact style distribution in the training phase by mixing and reorganizing the artifact style features, and to enhance the model’s ability to handle unknown forgery methods. Finally, an adaptive artifact guidance mechanism is proposed to dynamically amplify artifact-related features, further strengthening the model’s sensitivity to key artifacts. Extensive experiments on several benchmarks show that MAP-Mamba achieves superior robustness and generalization performance.

Abstract:
It is widely believed that increasing the amount of training data enhances the intelligence of deep learning models, which in turn heightens dependence on external datasets. However, these datasets are susceptible to adversarial poisoning, allowing attackers to insert backdoors that trigger misclassifications. Although various defense strategies have been suggested, training-phase defenses (TPDs) appear most promising, as they can significantly lower the attack success rate (ASR) without greatly impacting model performance. Nevertheless, creating TPDs that achieve both high accuracy and low ASR is challenging due to two main issues: 1) Many solutions require additional clean samples that match the distribution of the poisoned dataset, which is not always practical in real-world scenarios; 2) most existing solutions have high computational costs, sometimes requiring five to ten times the expense of standard training, which severely limits their practical use. To tackle these challenges, we introduce Confidence Consistency Detection (CCD), an efficient and lightweight training-phase backdoor detection method. CCD is particularly advantageous in situations where clean data is scarce or unavailable, as it completely eliminates the need for external clean samples. Moreover, CCD significantly reduces computational costs to just 25% to 50% of existing solutions (1.7 times the standard training time), providing a notable improvement over current TPD methods. The core innovation of CCD lies in its ability to utilize the high confidence shown by backdoor samples during the early stages of model training for precise detection. Specifically, we initially train a model on a poisoned dataset for a few epochs, followed by intra-class loss fine-tuning to increase sensitivity to poisoned samples. We then create preliminary sets of poisoned and clean samples by assessing the consistency of confidence variations before and after fine-tuning. These sets guide the model training, enabling the detection of high-confidence poisoned samples. Extensive experiments demonstrate that CCD effectively reduces the attack success rate (ASR) to 1.43%, while having a negligible impact on the model’s clean accuracy. In detecting poisoned samples, CCD achieves a 99% true positive rate (TPR) and a 0.033% false positive rate (FPR), setting a new benchmark in the field.

Affiliations: School of Cyber Science and Engineering, Southeast University, Nanjing, China; Lee Kong Chian School of Medicine, Nanyang Technological University, Nanyang Ave, Singapore; School of Remote Sensing and Information Engineering and the Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan, China; Department of Computer and Information Science, University of Macau, Macau, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China

Abstract:
Contrastive adversarial training emerges as an effective approach to enhancing model robustness in safety-critical applications, particularly point cloud recognition for autonomous driving and medical imaging. However, existing point cloud adversarial training methods mainly emphasize global contrastive learning while overlooking local geometric variations induced by adversarial perturbations. Motivated by the spatial and intensity variations of perturbations across axial views, we propose AVOC, a novel local-global adversarial training framework that utilizes axial-view-oriented contrastive learning. This framework leverages the smallest axial view for local contrastive learning, as it exhibits the highest perturbation differences, and utilizes the largest axial view for global contrastive learning, as it preserves global structural consistency. We conduct comprehensive experiments across four representative architectures, demonstrating significant robustness improvements on widely-adopted recognition benchmarks, including ModelNet40, ShapeNetPart, ModelNet40-C, and ScanObjectNN-C, and further validate its effectiveness on the large-scale KITTI benchmark for 3D object detection. Our results across diverse perturbation scenarios, encompassing white-box attacks, black-box attacks, and natural perturbations, demonstrate the consistent and significant model robustness enhancement of our proposed method.

Abstract:
Encrypted traffic classification is crucial for enhancing network management, service quality, and security. However, real-world network environments are inherently open-world scenarios in which traffic not only consists of known classes but also includes the continuous emergence of unknown classes. Existing deep learning methods typically rely on the closed-world assumption, which significantly limits their classification performance when dealing with unknown traffic types. This limitation makes it challenging to accurately classify known traffic classes and effectively identify unknown ones. Although a few studies have focused on open-world scenarios, these methods often use staged strategies and struggle to reliably detect unknown traffic or to estimate novel classes. To address these challenges, we propose an end-to-end Fine-grained Encrypted traffic Classification method based on Open-set Semi-supervised Learning, called FEC-OSL. This method comprises three mutually reinforcing core components. First, we design a dual-branch flow feature extraction module to capture detailed and discriminative flow features. Second, we introduce a novel energy-based perspective that leverages energy-boundary learning to distinguish known traffic from unknown traffic, enabling precise detection of known classes. Finally, an adaptive deep clustering approach integrates feature learning with clustering to achieve fine-grained classification of unknown flows. We conduct extensive experiments on three real-world datasets, and the results validate that our proposed method exhibits outstanding performance in handling both known and unknown encrypted traffic in open-world scenarios.
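To make the energy-based separation of known and unknown traffic more concrete, the sketch below computes the widely used free-energy score over classifier logits and flags high-energy inputs as unknown. This is the generic energy score from the out-of-distribution detection literature, not necessarily the energy-boundary learning used in FEC-OSL; the function names, the example logits, and the threshold value are assumptions chosen for illustration.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free-energy score E(x) = -T * log(sum_k exp(logit_k / T))."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return -temperature * log_sum_exp


def is_unknown(logits, threshold):
    # Higher energy means no known class fits well, so flag the flow as unknown.
    return energy_score(logits) > threshold


confident_known = [10.0, 0.0, 0.0]  # one known class dominates
ambiguous_flow = [1.0, 1.0, 1.0]    # no class stands out
assumed_threshold = -5.0            # illustrative value only
```

A flow whose logits are dominated by one known class receives a low energy, while a flow with flat logits receives a higher energy and falls on the unknown side of the assumed threshold. In practice the boundary would be learned or calibrated rather than fixed by hand.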

Abstract:
Large language models (LLMs) are vulnerable to training data extraction (TDE) attacks due to data memorization. This paper introduces a novel attack scenario wherein an attacker adversarially fine-tunes pre-trained LLMs to amplify the exposure of the original training data. Unlike prior TDE methods that mainly rely on post-hoc querying or prompt selection to elicit memorized content from a fixed model, our strategy directly alters the model’s parameters to intensify its retention of the pre-training dataset. To achieve this, the attacker needs to collect generated texts that are closely aligned with the pre-training data. However, without knowledge of the actual dataset, quantifying the amount of pre-training data within generated texts is challenging. To address this, we propose the use of pseudo-labels for these generated texts, leveraging membership approximations indicated by machine-generated probabilities from the target LLM using DetectGPT. We subsequently fine-tune the LLM via reinforcement learning from human feedback (RLHF) to favor generations with higher likelihoods of originating from the pre-training data, based on these membership probabilities. Our empirical findings indicate a remarkable outcome: LLMs with over 1B parameters exhibit a four- to eight-fold increase in training data exposure. We discuss potential mitigations and suggest future research directions.

Abstract:
The rise of wireless technologies has made the Internet of Things (IoT) ubiquitous, but the broadcast nature of wireless communications exposes IoT to authentication risks. Physical layer authentication (PLA) offers a promising solution by leveraging unique characteristics of wireless channels. As a common approach in PLA, hypothesis testing yields a theoretically optimal Neyman-Pearson (NP) detector, but its reliance on channel statistics limits its practicality in real-world scenarios. In contrast, deep learning-based PLA approaches are practical but tend to be suboptimal. To address these challenges, we proposed a learning-based PLA scheme driven by hypothesis testing and conducted extensive simulations and experimental evaluations using Wi-Fi. Specifically, we incorporated conditional statistical models into the hypothesis testing framework to derive a theoretically optimal NP detector. Building on this, we developed LiteNP-Net, a lightweight neural network driven by the NP detector. Simulation results demonstrated that LiteNP-Net could approach the performance of the NP detector even without prior knowledge of the channel statistics. To further assess its effectiveness in practical environments, we deployed an experimental testbed using Wi-Fi IoT development kits in various real-world scenarios. Experimental results demonstrated that LiteNP-Net outperformed the conventional correlation-based method as well as state-of-the-art Siamese-based methods.
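For readers unfamiliar with the Neyman-Pearson criterion, the sketch below works through a toy scalar case: a Gaussian observation with mean 0 under H0 and mean mu > 0 under H1, both with known standard deviation. It is not the detector derived in the paper, whose conditional statistical models are not given in the abstract; the hypotheses, the function names, and the parameter values are assumptions. It does show why such a detector depends on known statistics: both the threshold and the detection probability require sigma and mu.

```python
from statistics import NormalDist


def np_threshold(alpha, sigma=1.0):
    """Threshold tau with P(x > tau | H0) = alpha for x ~ N(0, sigma^2) under H0."""
    return NormalDist(0.0, sigma).inv_cdf(1.0 - alpha)


def detection_probability(alpha, mu, sigma=1.0):
    """P(x > tau | H1) for x ~ N(mu, sigma^2) under H1, with mu > 0."""
    tau = np_threshold(alpha, sigma)
    return 1.0 - NormalDist(mu, sigma).cdf(tau)


def decide(x, alpha, sigma=1.0):
    # For mu > 0 the likelihood ratio is increasing in x, so the
    # Neyman-Pearson test reduces to comparing x against tau.
    return "H1" if x > np_threshold(alpha, sigma) else "H0"
```

Fixing the false-alarm rate alpha determines the threshold, and the detection probability then grows with the separation mu between the two hypotheses.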

Abstract:
Anonymous tokens (AT) have emerged as a critical tool for privacy-preserving authentication. However, state-of-the-art systems face two principal technical limitations: the risk of token transferability, which compromises accountability, and reliance on centralized issuers, which introduces a single point of failure. To address these limitations, we present the first construction of a non-transferable anonymous token system with decentralized issuance (D-NTAT). In particular, our construction supports a dynamic set of issuers, empowering users to obtain tokens by interacting with any subset of the current issuers. The token is publicly verifiable, unlinkable to the issuance process, and, most importantly, non-transferable, even though it is redeemed anonymously. To accomplish this objective, we formalize the notions of D-NTAT, provide a specific construction of a set of protocols from which variants offering enhanced functionalities can be derived, and rigorously prove their security properties. Finally, a proof-of-concept implementation is presented to evaluate their efficiency, which is crucial for blockchain applications such as electronic voting.

Abstract:
Byzantine Fault Tolerance (BFT) protocols are a critical research area in distributed systems and blockchain consensus due to their capacity to deliver high throughput and low latency. Traditional BFT protocols typically rely on a single leader to propose transactions and aggregate votes, which often creates a bottleneck due to the leader’s limited communication and computational capacities. The introduction of multi-leader BFT has the potential to mitigate this issue by increasing system parallelism. However, existing approaches fail to address the challenge of electing multiple leaders and lack a comprehensive analysis of the relationship between the number of leaders, security constraints, and system throughput. In this paper, we study the performance and security of multi-leader BFT protocols. First, we introduce a secret multi-leader election method resistant to corruption attacks, in which selected leaders’ identities remain unknown to others until they propose transactions. Then, we present specific multi-leader BFT constructions that support pipelined processing, realizing high processing parallelism and optimized throughput. In addition, a cross-leader view-change mechanism is designed for multi-leader BFT to enable efficient replacement of malicious leaders. Furthermore, we analyze the impact of the number of leaders on security and demonstrate that our proposals meet the required security standards. Experimental results reveal that the system achieves a throughput of up to 101 ktx/sec with 128 nodes, highlighting the potential of multi-leader BFT to significantly enhance the performance of blockchain systems.

Abstract:
Recent advancements in Fully Homomorphic Encryption (FHE) have significantly impacted encrypted database management, particularly in secure data querying. However, FHE poses notable performance challenges, especially in processing aggregation queries. These challenges stem from the need to homomorphically evaluate selection predicates via a linear scan over encrypted rows and from the resultant accumulation of redundant data. Overcoming these limitations necessitates innovative solutions that transcend traditional FHE optimizations. We propose a scheme, named Accelerated Homomorphically Encrypted DataBase (AHEDB), to enhance the efficiency of aggregation queries in homomorphically encrypted databases. In the scheme, we employ Encrypted Multiple Maps (EMM) to reduce the computational load inherent in homomorphic operations, thereby achieving significant improvements over existing encryption schemes. We further introduce the Single Range Cover (SRC) algorithm for range and equality indexing to address potential security vulnerabilities. We implement our scheme and conduct comparative analyses with an FHE-based database system. Our experiments demonstrate that our scheme offers significant advantages in query overhead, striking a balance between security and efficiency.

Abstract:
Fixed-length fingerprint representations, which map each fingerprint to a compact and fixed-size feature vector, are computationally efficient and well-suited for large-scale matching. However, designing a robust representation that effectively handles diverse fingerprint modalities, pose variations, and noise interference remains a significant challenge. In this work, we propose a fixed-length dense descriptor of fingerprints, and introduce FLARE—a fingerprint matching framework that integrates the Fixed-Length dense descriptor with pose-based Alignment and Robust Enhancement. This fixed-length representation employs a three-dimensional dense descriptor to effectively capture spatial relationships among fingerprint ridge structures, enabling robust and locally discriminative representations. To ensure consistency within this dense feature space, FLARE incorporates pose-based alignment using complementary estimation methods, along with dual enhancement strategies that refine ridge clarity while preserving the original fingerprint modality. The proposed dense descriptor supports fixed-length representation while maintaining spatial correspondence, enabling fast and accurate similarity computation. Extensive experiments demonstrate that FLARE achieves superior performance across rolled, plain, latent, and contactless fingerprints, significantly outperforming existing methods in cross-modality and low-quality scenarios. Further analysis validates the effectiveness of the dense descriptor design, as well as the impact of alignment and enhancement modules on the accuracy of dense descriptor matching. Experimental results highlight the effectiveness and generalizability of FLARE as a unified and scalable solution for robust fingerprint representation and matching. The implementation and code will be publicly available at our GitHub repository https://github.com/Yu-Yy/FLARE

Abstract:
Resource constraints and data heterogeneity pose significant hurdles for malicious traffic detection in satellite networks. To address this, we propose STELLAR, a similarity-based federated learning framework tailored for efficient space computing. STELLAR introduces a multi-dimensional similarity metric to dynamically select representative nodes, effectively eliminating computational redundancy. Furthermore, it ensures system robustness and trust through a time-window asynchronous protocol with staleness compensation and a lightweight proxy-based authentication scheme. Evaluations demonstrate that STELLAR outperforms specialized baselines, reducing network-wide energy consumption by 77%–80% while achieving 99.72% detection accuracy in heterogeneous Non-IID environments. These results validate STELLAR as a sustainable and robust solution for distributed security in resource-constrained satellite networks.

Abstract:
Private protocol reverse engineering is the main way to address unknown traffic, which brings huge security risks to the current network environment. Network traffic-based protocol reverse engineering approaches are the basis of traffic security supervision and are widely used and flexible. These approaches utilize multiple algorithms from different perspectives to extract the protocol specifications from messages, but they fail to recognize the importance of message segmentation and do not adequately evaluate the relation between adjacent bytes, leading to imprecise results. To address these issues, we propose SLMSP, a self-supervised learning-based message segmentation approach for private protocol reverse engineering. SLMSP mines the rich information embedded in the word order and word semantics between adjacent bytes through self-supervised learning, and then makes optimal decisions about where the message should be segmented based on the fusion of that information, combining horizontal inference and vertical correction. After that, SLMSP extracts protocol formats based on fine-grained message segmentation by introducing the progressive sequence merging algorithm. We conduct comprehensive experiments to demonstrate the effectiveness of SLMSP. The experimental results demonstrate that SLMSP achieves strong performance in both message segmentation and format inference, and that it has advantages over previous works.

Abstract:
As face recognition systems become more prevalent and various presentation attacks continue to surface, the significance of face anti-spoofing (FAS) has escalated. In real-world scenarios, we can utilize the existing labeled sample sets, and we can also obtain a wide range of unlabeled face samples, which are the target samples that we need to classify. However, the existing cross-domain FAS methods do not fully utilize the target domain data. That is, they only align the overall distribution of features shared by the source and target domains, but cannot align the live and spoof features relevant to classification within the source and target domains. This results in poor generalization performance in cross-domain scenarios, especially when the target domain is more complex than the source domain. To address this issue, we propose a novel domain adaptation approach called Fine-Grained Domain Alignment for Face Anti-Spoofing with Asymmetric Pseudo-Labels (FGDA-APL). In this approach, we initially employ traditional domain alignment methods to achieve preliminary domain alignment, which can be considered as coarse-grained domain alignment. Subsequently, we introduce the Multi-Graph Convolutional Network (MGCN) module, which generates asymmetric feature spaces and supports cross-supervised pseudo-labeling so that asymmetric pseudo-labels can be exploited. Within the MGCN module, features extracted by the feature extractor are guided to achieve feature aggregation, resulting in multiple distinct feature spaces. We hypothesize that pseudo-labels with high confidence in these asymmetric feature spaces can be regarded as reliable pseudo-labels. By cross-supervising the pseudo-labels generated by both the classifier and the MGCN, we ultimately achieve alignment and classification of real and spoofing features within both the source and target domains. Consequently, we achieve superior classification performance on target domain data. Extensive experiments show that our proposed method achieves state-of-the-art performance across multiple public datasets.

Abstract:
Recently, there has been growing interest in optimizing fully homomorphic encryption (FHE) circuits to enable Boolean function evaluations over ciphertexts. While existing works utilize functional bootstrapping (FBS) to efficiently evaluate logic gates, the evaluation efficiency for large-scale circuits remains limited. Recent advances introduce a fast ciphertext conversion method, making it feasible to evaluate look-up tables (LUTs) over homomorphic multiplexer operations. In this work, we propose a new circuit synthesis framework, SALUS, which automatically generates and evaluates gate-level graphs over homomorphic LUTs given an input Boolean circuit. We apply the binary decision diagram (BDD) reordering method and multi-value refresh techniques to efficiently evaluate complex LUTs. Additionally, we propose a heuristic algorithm to merge LUTs in a given circuit into multi-output LUTs. In the experiments, we examine the efficiency of SALUS using a wide range of benchmark suites, including the EPFL and ISCAS benchmark circuits. We show that SALUS achieves a reduction of up to 26× in computational latency compared to the state-of-the-art homomorphic circuit synthesis method. Furthermore, we evaluate real-world applications, e.g., image filtering and matrix multiplication, and achieve an average speedup of 8.6× (with a maximum speedup of 24×) compared to the FBS-based method.

Abstract:
Channel state information (CSI) is known to be crucial for both enhancing the transmission performance and ensuring physical-layer security (PLS) in wireless communication systems. To estimate a channel’s CSI, the transmitter (Tx) typically broadcasts a predetermined pilot signal, then the receiver (Rx) computes the channel coefficients based on the received pilot signal and returns the estimated CSI to the Tx. Most, if not all, existing communication algorithms simply assume that the fed-back CSI is reliable and secure. However, in practice, a malicious terminal may send falsified CSI to the infrastructure, thus compromising the throughput and/or security of the communication over the channel. Although some researchers have already identified this vulnerability, demonstrated the feasibility of CSI-forgery attacks, and designed countermeasures against them, their methods either i) are tailored to specific types of attacks, thus lacking generality, or ii) require modifications to the pilot sequence and hence the protocol. To counter CSI-forgery attacks and mitigate the deficiencies of existing countermeasures, we first develop a comprehensive CSI-forgery model that subsumes the existing CSI-forgery attacks as special instances, facilitating the design of general countermeasures. Then, we propose a novel approach, called SecCSI, to detect potential CSI-forgery activities and identify their initiators using a reconfigurable intelligent surface (RIS). SecCSI leverages the RIS to secretly and dynamically modify the wireless environment in which the pilot signal is transmitted, in a manner transparent to the Rx. The infrastructure can, therefore, detect any attempted manipulation of CSI by appropriately configuring the reflection coefficient matrix of the RIS, transmitting the pilot signal, and analyzing all CSI feedback. SecCSI can serve as a guard module for existing communication systems that simply accept the fed-back CSI without checking its trustworthiness. Our theoretical analysis and our experimental and numerical evaluations show that SecCSI effectively detects CSI-forgery attacks and identifies the attacker.

Abstract:
Large vision-language models (LVLMs) have demonstrated remarkable capabilities across a wide range of multimodal understanding and reasoning tasks. However, recent research shows that LVLMs are susceptible to adversarial examples. Existing attacks either optimize perturbations on the visual input or manipulate prompts to fool the LVLMs, requiring extensive design and engineering of these adversarial manipulations. While straightforward visual transformations can boost generalization during training, their potential risks to LVLMs in terms of safety and trustworthiness have been largely neglected. In this paper, we ask an intriguing question: can simple, easy-to-implement adversarial visual transformations be utilized to attack LVLMs? Motivated by this research gap and new attack setting, we propose the first comprehensive assessment of LVLMs’ adversarial robustness to visual transformations by testing LVLMs’ resilience to all possible transformation operations. Our empirical observations suggest that with an appropriate combination of the most harmful transformations, we can build transformation-based attacks that are more adversarial to LVLMs. Moreover, adversarial learning of visual transformations is further introduced to adaptively apply the malicious impacts of all potentially harmful transformations to the raw images via gradient approximation, improving attack effectiveness and imperceptibility. We hope that this study can provide deeper insights into the potential vulnerability of LVLMs to adversarial visual transformations.

Abstract:
Website fingerprinting (WF) attacks, which covertly monitor user communications to identify the web pages they visit, pose a serious threat to user privacy. Existing WF defenses attempt to reduce attack accuracy by disrupting traffic patterns, but attackers can retrain their models to adapt, making these defenses ineffective. Meanwhile, their high overhead limits deployability. To overcome these limitations, we introduce a novel controllable website fingerprinting defense called TrapFlow based on backdoor learning. TrapFlow exploits the tendency of neural networks to memorize subtle patterns by injecting crafted trigger sequences into targeted website traffic, causing the attacker’s model to build incorrect associations during training. If the attacker attempts to adapt by training on such noisy data, TrapFlow ensures that the model internalizes the trigger as a dominant feature, leading to widespread misclassification across unrelated websites. Conversely, if the attacker ignores these patterns and trains only on clean data, the trigger behaves as an adversarial patch at inference time, causing model misclassification. To achieve this dual effect, we optimize the trigger using the Fast Levenshtein-like distance to maximize both its learnability and distinctiveness from normal traffic. Experiments show that TrapFlow significantly reduces the accuracy of the RF attack from 99% to 6% with 74% data overhead. This compares favorably against two SOTA defenses: FRONT reduces accuracy by only 2% at a similar overhead, while Palette achieves 32% accuracy, but with 48% more overhead. We further validate the practicality of our method in a real Tor network environment.

Abstract:
In this paper, we propose a hyperparameter-specialized adaptive fingerprinting framework named AdaParse for model reverse engineering, which aims at predicting hyperparameters of interest in generative models from the given AI-generated images. Existing methods rely on a single coarse model fingerprint that is originally designed for model-level attribution, which makes it difficult to distinguish fine-grained traces corresponding to different hyperparameter configurations in a multitude of generative models. To address this, our AdaParse dynamically responds to instance-level variations by estimating hyperparameter-specific fingerprints via personalizing estimation networks tailored for each input image. Specifically, our approach simultaneously learns two-branch hypernetworks that balance instance-aware and model-agnostic prior knowledge for fingerprint generation. To enable efficient network personalization, we further propose a Broadcasted Fusion module that transforms condensed feature codes into adaptive parameters through factorized weight generation with enhanced representative capacity. Extensive experiments on the large-scale public dataset across 123 generative models demonstrate that our approach outperforms previous state-of-the-art methods. Code available at https://github.com/lizhuoxun/AdaParse/

Abstract:
Cross-chain transfer significantly enhances asset management among multiple blockchains. While existing cross-chain transfer schemes facilitate transferring asset ownership of users between a source blockchain and a target blockchain, they do not consider a complex scenario involving m source blockchains and one target blockchain (termed m ∼ 1 cross-chain transfer). Furthermore, the variants of existing schemes for this scenario fail to concurrently achieve atomicity, unlinkability, and practical properties such as transaction aggregability and non-collateralization, posing substantial limitations. To bridge this gap, we propose ShiftHub, the first m ∼ 1 cross-chain transfer scheme that simultaneously fulfills the above desired properties. ShiftHub guarantees atomicity and transaction aggregability by integrating exchange protocols with a novel t-message-clustered redeem protocol. Unlinkability is achieved by additionally incorporating commonly used assumptions such as fixed-amount transfers, while the need for sender collateralization is eliminated by enabling the reuse of locked assets. Compatible with any blockchains that employ adaptor signatures or BLS signatures for transaction authentication, ShiftHub requires only signature verification and timelock functionalities on chain, ensuring near-universal blockchain compatibility. We formally prove the security of ShiftHub under static corruption using the universal composability framework. Extensive evaluation shows that ShiftHub offers superior performance over the existing scheme variant with the closest underlying protocols to ShiftHub, in terms of both off-chain and on-chain overhead, when completing m ∼ 1 cross-chain transfer. In particular, ShiftHub reduces the number of required transactions by (2m-1)/(3m).
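
As a quick reading aid for the transaction-reduction figure above, the fraction can be rewritten as follows. This assumes the intended expression is (2m-1) divided by 3m, which is the only reading that stays below 1; the abstract does not spell out the derivation.

```latex
\[
  \frac{2m-1}{3m} \;=\; \frac{2}{3} - \frac{1}{3m},
  \qquad
  m = 1 \Rightarrow \tfrac{1}{3}, \quad
  m = 2 \Rightarrow \tfrac{1}{2}, \quad
  \lim_{m \to \infty} \frac{2m-1}{3m} = \frac{2}{3}.
\]
```

Under that reading, the saving grows with the number of source blockchains and approaches two thirds of the transactions.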

Affiliations: School of Information Technology and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, China; School of Cyberspace Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; The University of Western Australia, Albany, Australia; Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong, China; School of Computing and Information Technology, Institute of Cybersecurity and Cryptology, University of Wollongong, Wollongong, Australia

Abstract:
Federated Learning (FL) enables collaborative machine learning training while preserving data privacy. However, the reliance of typical FL on a central server exposes it to a single point of failure. Decentralized Federated Learning (DFL) emerges as a promising distributed framework, allowing clients to directly share models without server intervention, thereby addressing this challenge. Nevertheless, due to its decentralized nature, DFL is highly susceptible to Byzantine attacks orchestrated by malicious clients. Existing Byzantine-resilient DFL algorithms, though few, remain vulnerable to adaptive attacks due to their heavy reliance on gradient checks of local models, which can be adaptively manipulated by intelligent adversaries. To tackle this issue, we propose a DFL aggregation scheme called FORCE (Byzantine-Resilient Decentralized Federated Learning via Game-Theoretic Contribution Aggregation). Drawing inspiration from the Shapley value in game theory, FORCE shifts from gradient-checking approaches to a universal metric, the loss of the local model, which is independent of specific gradients, to identify potentially malicious clients. To reduce the computational overhead of FORCE as the number of neighboring clients scales up, we propose a computationally lightweight variant, FORCE^-, which approximates the Shapley value computation. This variant is more scalable for resource-restricted DFL clients that are also aggregators. Experimental results on four diverse datasets under three attacks demonstrate that FORCE outperforms existing state-of-the-art Byzantine-resilient DFL aggregation methods, effectively defending against Byzantine attacks.
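
To make the idea of a loss-based Shapley contribution concrete, here is a minimal sketch of an exact Shapley computation over a few neighbors. It is not the FORCE algorithm itself: the utility function, the baseline loss, the neighbor names, and the loss values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of each player for a set-valued utility function."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding p
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical validation losses of each neighbor's model (lower is better)
LOSSES = {"a": 0.30, "b": 0.35, "c": 2.50}
BASELINE_LOSS = 1.0  # assumed loss when no neighbor model is used


def utility(coalition):
    """Toy utility: negative mean loss of the coalition's members."""
    if not coalition:
        return -BASELINE_LOSS
    return -sum(LOSSES[p] for p in coalition) / len(coalition)


phi = shapley_values(list(LOSSES), utility)
print(phi)
```

In this toy setting the high-loss neighbor "c" receives a negative contribution score, so an aggregator could down-weight or exclude it. The exact computation enumerates every coalition, so its cost grows exponentially with the number of neighbors, which is the overhead an approximate variant would aim to avoid.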

Abstract:
Transaction propagation delay limits the block interval and is one of the main bottlenecks in improving Bitcoin throughput. However, transaction relay in Bitcoin is entirely voluntary, which results in low bandwidth and high transaction propagation delay. Improving relay motivation by introducing incentives can effectively reduce delay, but it still faces challenges such as Sybil attacks during reward allocation, leakage of network-layer privacy, and high on-chain/off-chain overhead. Therefore, this paper proposes Txtail, a practical transaction relay incentive scheme for Bitcoin, based on continuously attaching relay evidence representing the relayers’ identity and contribution during transaction propagation. We employ a free pricing mechanism based on the game between relayers to allocate rewards fairly. We design an order-insensitive relay evidence structure based on aggregate signatures and public key mapping, which reduces off-chain data overhead while alleviating the leakage of relay paths by obfuscating the relay order. We construct a verifiable lottery mechanism based on Merkle tree commitments to reduce the data that needs to be uploaded to the chain. Both theoretical and experimental results show that Txtail reduces the per-hop off-chain overhead and the overall on-chain overhead by 96.6% and 79.8%, respectively, compared with state-of-the-art baselines, while remaining practical for deployment.

Abstract:
The growth of communication technologies has led to a corresponding increase in the need for information security. Covert communication techniques can significantly reduce the likelihood that a transmitted signal is intercepted by unintended receivers, thereby increasing the security of the information. In this paper, we propose an energy-dispersed random time-hopping (ED-RTH) covert communication scheme. Building on existing time uncertainty schemes, this scheme further enhances time uncertainty through random time-hopping, while dispersing signal energy to achieve stronger covertness. We present a model for analyzing the covertness of the ED-RTH scheme. Given the complexity of the likelihood ratio distribution, the Kullback-Leibler (KL) divergence is used to analyze the lower bound of Willie’s detection error probability. Using behavioral-level simulations for validation, we investigate the impact of the number of dispersed time slots K on both Willie’s detection error probability and the system’s covert throughput. Finally, we compare the proposed ED-RTH scheme with an existing time uncertainty scheme, and the results demonstrate that the ED-RTH scheme achieves significantly enhanced covertness.
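
For readers unfamiliar with how a KL divergence yields a lower bound on detection error, the standard argument in the covert-communication literature can be written as below. The symbols are assumptions introduced here, not taken from the abstract: P_0 and P_1 denote the distributions of Willie's observations without and with a transmission, and alpha and beta are his false-alarm and missed-detection probabilities. The exact form used in the paper may differ.

```latex
\[
  \alpha + \beta \;\ge\; 1 - \mathcal{V}_T(P_0, P_1)
  \;\ge\; 1 - \sqrt{\tfrac{1}{2}\, D(P_0 \,\|\, P_1)},
\]
where $\mathcal{V}_T$ is the total variation distance and the second step is
Pinsker's inequality (with $D$ measured in nats).
```

A small KL divergence therefore forces the sum of Willie's two error probabilities close to 1, which is the usual covertness criterion.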

Abstract:
The rapid evolution of generative AI has intensified the threat of realistic audio-visual deepfakes, demanding robust and generalizable detection methods. Existing solutions primarily address unimodal (e.g., audio, visual) forgeries but struggle with multimodal manipulations due to inadequate handling of heterogeneous modality features and poor cross-dataset generalization. We propose FauForensics, a novel framework leveraging biologically invariant facial action units (FAUs), which are quantitative descriptors of facial muscle activity linked to emotion physiology. They serve as forgery-resistant representations that reduce domain dependency while capturing subtle synthetic-content disruptions. In addition, unlike prior clip-level comparisons, our method computes frame-wise audio-visual similarities via a fusion module with learnable cross-modal queries, dynamically aligning lip-audio relationships and mitigating feature heterogeneity. Experiments on four publicly available datasets show state-of-the-art performance with 5.17% average cross-dataset improvement over existing methods.

Abstract:
Consider a scenario in which an issuer wishes to issue an attribute-based anonymous credential to a user, but the issuance is conditional on a number of real-world outcomes. These outcomes involve multiple entrusted oracles confirming the occurrence of several events, after which the issuance can proceed successfully. Such contractual credentials can serve as an important building block for blockchain-based Web 3.0 systems and can be used in real-world applications that require privacy-preserving, pre-scheduled authorization. However, there is currently no work that enables the pre-issuance of credentials based on oracles and events. In this work, we propose contractual anonymous credentials, called FlyCred, to fill this gap. With FlyCred, the issuer can issue an encrypted credential to a user, controlled by a dual-layer authorization policy consisting of oracle-based and event-based expressive policies. As core building blocks, we introduce two novel cryptographic primitives, the Adaptor Anonymous Credential and ABE-based Signature Witness Encryption with Tags, which may be of independent interest. We provide efficient instantiations of these primitives and evaluate their performance under different security levels and system parameters on a laptop, showing that the computation and communication overhead of the credential pre-issuance is less than 85.8 seconds and 8.7 MB, respectively.

Abstract:
Owing to its inherent attributes of high security, privacy preservation, and liveness detection, vein recognition has garnered significant attention, with deep learning (DL) models prevailing in the field. In particular, Mamba, a recent DL architecture showing robust feature representation with linear computational complexity, has been applied successfully to visual tasks. However, Vision Mamba captures long-distance feature dependencies but degrades local feature details. Besides, manually designing a Mamba architecture based on human prior knowledge is very time-consuming and error-prone. To address these limitations, we first propose a hybrid network structure named Global-local Vision Mamba (GLVM) to learn both local correlations and global dependencies within images for comprehensive vein feature representation. Second, we design a Multi-head Mamba to learn the dependencies along different directions, so as to improve the feature representation of Vision Mamba. Third, to learn complementary features, we propose a ConvMamba block consisting of three branches: Multi-head Mamba branch (MHMamba), Feature Iteration Unit branch (FIU), and Convolutional Neural Network (CNN) branch, with FIU aiming to fuse convolutional local features with Mamba global representations. Finally, we propose a Global-local Alternate Neural Architecture Search (GLNAS) method, which alternately searches for the optimal architecture of GLVM through a weight entanglement strategy and an evolutionary algorithm. We have carried out rigorous experiments on five public vein datasets to assess performance. Our approach achieves the highest accuracies (96.84%, 99.63%, 95.73%, 99.72%, and 99.14%) and the lowest EERs (0.27%, 0.07%, 0.48%, 0.07%, and 0.12%) among all existing approaches on the five datasets, which demonstrates that our approach is capable of learning more complete features than existing approaches. In addition, the visual assessment experiments show that our approach extracts more global vein architecture and local vein detail for recognition.

Abstract:
This work proposes an encrypted controller framework for closed-loop control systems with nonlinear dynamics over fully homomorphic encryption (FHE). Unlike differential privacy and output masking, FHE is a cryptographic primitive that provides assumption-based confidentiality guarantees under standard hardness assumptions. We observe that existing encrypted control frameworks remain largely limited to linear open-loop systems, primarily due to two key challenges: rapid ciphertext noise accumulation in feedback loops and the substantial computational overhead of nonlinear operations. In control systems, feedback is essential for real-time error correction, while nonlinear characteristics are critical for accurately modelling complex system behaviours. To address these challenges, we propose ENClose, a novel encrypted control framework that enables low-latency execution of both feedback control and nonlinear function evaluation. Specifically, ENClose introduces a low-latency homomorphic nonlinear computation framework that accelerates functional bootstrapping (FBS) by combining function segmentation with tree-based encrypted selection. This framework not only mitigates noise accumulation in encrypted feedback loops but also significantly improves the efficiency of FBS under high-precision settings, meeting the computational demands of dynamic control systems. Experimental results show that ENClose achieves a 3× to 20× speedup over state-of-the-art encrypted controllers. We validate ENClose through real-world applications, including multi-vehicle formation, spring–mass–damper control, and anomaly recovery, where the results demonstrate high-precision tracking and successful reconvergence after anomalies.

Abstract:
Recent text-to-image (T2I) models have exhibited remarkable performance in generating high-quality images from text descriptions. However, these models are vulnerable to misuse, particularly generating not-safe-for-work (NSFW) content, such as sexually explicit, violent, political, and disturbing images, raising serious ethical concerns. In this work, we present PromptGuard, a novel content moderation technique that draws inspiration from the system prompt mechanism in large language models (LLMs) for safety alignment. Unlike LLMs, T2I models lack a direct interface for enforcing behavioral guidelines. Our key idea is to optimize a safety soft prompt that functions as an implicit system prompt within the T2I model’s textual embedding space. This universal soft prompt ( P_ ) directly moderates NSFW inputs, enabling safe yet realistic image generation without altering the inference efficiency or requiring proxy models. We further enhance its reliability and helpfulness through a divide-and-conquer strategy, which optimizes category-specific soft prompts and combines them into holistic safety guidance. Extensive experiments across five datasets demonstrate that PromptGuard effectively mitigates NSFW content generation while preserving high-quality benign outputs. PromptGuard runs 3.8 times faster than prior content moderation methods and surpasses eight state-of-the-art defenses. Rigorous evaluation using both multi-head classifiers and VLM-based guardrails confirms its robustness, with average unsafe ratios as low as 5.84% and 6.18%, respectively. Our code and dataset are available at https://t2i-promptguard.github.io/

Abstract:
As RF sensing increasingly moves toward real-world deployments, critical concerns emerge around unauthorized model usage and access control. Existing protection approaches, such as watermarking, are typically reactive, task-specific, and ineffective at runtime. This paper presents AuthRF (Authentication with passports for RF sensing models), a novel signal-level passport mechanism that proactively enforces access control by mapping a user-specific passport to phase-compensation weights in the signal processing pipeline. Valid passports yield coherent phase alignment and high-fidelity representations, while invalid or forged ones induce phase distortion that significantly degrades model performance. This design effectively deters unauthorized access, supports scalable multi-user authentication, and enables personalized service provisioning through controlled passport variation. We evaluate AuthRF on six representative RF sensing tasks using both WiFi and radar signals. Experimental results demonstrate its robust protection capabilities and seamless integration with existing sensing pipelines, positioning AuthRF as a practical foundation for secure and commercial-grade RF sensing deployment.

Abstract:
Source code vulnerabilities represent a critical threat to software security, potentially leading to severe consequences such as data breaches and system failures. Traditional static analysis tools, while widely used, suffer from high false positive rates and struggle to adapt to the increasing complexity of modern software. Deep learning-based approaches hold promise for automated vulnerability detection, but they face challenges including limited dataset quality, inadequate feature extraction, and lack of precise vulnerability localization capabilities. To address these limitations, we propose MaliVD, a novel vulnerability detection method in source code that leverages a multi-modal attention mechanism. MaliVD not only identifies vulnerability types but also pinpoints the specific lines of code where vulnerabilities are triggered. The model extracts sequential, tree-based, and graph-based features from source code and employs specialized neural networks to learn these diverse representations. By strategically focusing on Points of Interest within the code, MaliVD effectively prioritizes potentially vulnerable code regions, enhancing both detection accuracy and localization precision. Experimental results show that when compared with eight advanced vulnerability detection models across three large datasets, MaliVD demonstrates superior vulnerability detection and localization capabilities, maintaining high F_1 scores and localization precision. Particularly on the ReliVul dataset, the F_1 score is improved by 21.82%, and the Top-5 localization accuracy is 18% higher than other methods, with lower false positives across six mainstream vulnerability types, validating MaliVD’s practical application value in real-world environments.

Abstract:
Vehicular Ad Hoc Networks (VANETs) are the cornerstone of intelligent transportation systems and autonomous driving. Vehicle-to-road communication, as one of the core services, faces increasing risks of privacy breaches. Signcryption technology effectively ensures secure information transmission. However, existing signcryption schemes still have deficiencies in terms of transmission robustness and identity privacy protection. To solve these issues, this paper proposes a Robust Identity-based Signcryption scheme (RIBSC) for VANETs. In RIBSC, we first design an area session key distribution mechanism based on the Chinese Remainder Theorem (CRT), which can dynamically revoke the decryption ability of malicious Roadside Units (RSUs) in real time. Only RSUs approved by the Trusted Detection Center (TDC) can obtain a valid session private key by conducting one modular operation. We then utilize a traceable pseudonym mechanism to protect the identity privacy of vehicles and RSUs, which allows their true identities to be traced when illegal activities occur. Finally, we provide a rigorous security proof under the random oracle model and demonstrate the performance advantages of RIBSC through extensive experiments. Notably, the session information is fixed at only 148 bytes, regardless of the number of RSUs.
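
The following toy sketch illustrates the general idea of CRT-based key distribution with a single modular reduction on the receiver side. It is not the RIBSC construction: the masking step, the tiny moduli, the RSU names, and the function names are all hypothetical, and a real scheme would use cryptographically sized parameters and proper key encapsulation.

```python
from math import prod


def crt(residues, moduli):
    """Solve x = r_i (mod m_i) for pairwise-coprime moduli; result lies in [0, prod(moduli))."""
    big_m = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+)
        x += r * partial * pow(partial, -1, m)
    return x % big_m


# Hypothetical per-RSU material: (secret prime modulus, secret mask)
RSUS = {"rsu1": (1009, 123), "rsu2": (1013, 456), "rsu3": (1019, 789)}


def broadcast_value(session_key, approved):
    """Centre side: fold the masked key for every approved RSU into one integer."""
    moduli = [RSUS[name][0] for name in approved]
    residues = [(session_key + RSUS[name][1]) % RSUS[name][0] for name in approved]
    return crt(residues, moduli)


def recover(x, name):
    """RSU side: one modular reduction (after removing the mask) yields the key."""
    modulus, mask = RSUS[name]
    return (x - mask) % modulus


session_key = 777  # must be smaller than every modulus in this toy setup
x = broadcast_value(session_key, ["rsu1", "rsu2"])  # rsu3 is left out (revoked)
print(recover(x, "rsu1"), recover(x, "rsu2"))
```

Because the broadcast value only encodes congruences for approved RSUs, leaving an RSU out of the list means its reduction is no longer tied to the session key, which is the intuition behind revocation in CRT-based designs.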

Abstract:
Recent advances in deepfake forensics have primarily focused on improving the classification accuracy and generalization performance. Despite enormous progress in detection accuracy across a wide variety of forgery algorithms, existing algorithms lack intuitive interpretability and identity traceability to help with forensic investigation. In this paper, we introduce a novel DeepFake Identity Recovery scheme (DFREC) to fill this gap. DFREC aims to recover the pair of source and target faces from a deepfake image to facilitate deepfake identity tracing and reduce the risk of deepfake attacks. It comprises three key components: an Identity Segmentation Module (ISM), a Source Identity Reconstruction Module (SIRM), and a Target Identity Reconstruction Module (TIRM). The ISM segments the input face into distinct source and target face information, and the SIRM reconstructs the source face and extracts latent target identity features with the segmented source information. The background context and latent target identity features are synergetically fused by a Masked Autoencoder in the TIRM to reconstruct the target face. We evaluate DFREC on different high-fidelity face-swapping attacks on FaceForensics++, CelebaMegaFS, FFHQ-E4S, and Celeb-DFv2 datasets, which demonstrate its superior recovery performance over state-of-the-art deepfake recovery algorithms. In addition, DFREC is the only scheme that can recover both pristine source and target faces directly from the forgery image with high fidelity.

Abstract:
The feverish personal data gold rush has made sensitive information leakage a non-negligible issue, turning it into a popular target for malicious attacks. Although some data may not seem to reveal private information directly, they could still be exploited through malicious intervention and inference; such data are referred to as risk data. Studies have claimed that such risk data exhibit the trait of macro-level collaborative leakage, meaning that individually harmless risk data can reveal sensitive information when combined. However, why the macro-level collaborative leakage will occur remains relatively uncharted. Hence, in this paper, we conduct rigorous quantitative analyses for the first time, to trace the root of the macro-level collaborative leakage. We conclude that this phenomenon arises from the collaborative effects among pieces of risk data concerning sensitive information. In light of this, we formulate the sufficient condition for the occurrence of the macro-level collaborative leakage and investigate its presence in the Gaussian-distributed data. We highlight that, on the one hand, the Gaussian distribution can align the correlation between risk data and sensitive data at both the micro- and macro- levels, thereby preventing the macro-level collaborative leakage. This reveals the potential of the Gaussian distribution in enhancing data privacy protection from a macro perspective. On the other hand, the ability of the Gaussian distribution to counteract the macro-level collaborative leakage is inherently limited, which further corroborates the ubiquity of this phenomenon. Our insights underscore the need for more comprehensive security and privacy protection mechanisms to ensure data security and confidentiality.

Abstract:
Semantic Communication (SC) backdoor attacks aim to utilize triggers to manipulate the system into producing predetermined outputs via backdoored shared knowledge. Current SC backdoors adopt monomorphic paradigms with a single attack target, which suffer from limited attack diversity, efficiency, and flexibility in heterogeneous downstream scenarios. To overcome these limitations, we propose SemBugger, a polymorphic SC backdoor. By dynamically adjusting the trigger intensity, SemBugger exerts fine-grained control over the SC knowledge to generate diverse malicious results from the system. Specifically, SemBugger is realized through a multi-effect poisoning-training framework. It introduces graded-intensity triggers to poison training data and optimizes SC systems with a hierarchical malicious loss. The trained system’s knowledge dynamically adapts to the trigger intensity in inputs to yield target outputs, all while preserving transmission fidelity for benign samples. Moreover, to augment SC security, we propose a provably robust defense that resists SemBugger’s homogeneous attacks through a controlled noise mechanism. It operates by strategically adding noise to SC inputs, and we formally provide a theoretical lower bound on the defense efficacy. Experiments across diverse SC models and benchmark datasets indicate that SemBugger attains high attack efficacy while maintaining the regular functionality of SC systems. Meanwhile, the designed defense effectively neutralizes SemBugger attacks.

Abstract:
Chain-of-Thought (CoT) enhances an LLM’s ability to perform complex reasoning tasks, but it also introduces new security issues. In this work, we present ShadowCoT, a novel backdoor attack framework that targets the internal reasoning mechanism of LLMs. Unlike prior token-level or prompt-based attacks, ShadowCoT directly manipulates the model’s cognitive reasoning path, enabling it to hijack multi-step reasoning chains and produce logically coherent but adversarial outcomes. By conditioning on internal reasoning states, ShadowCoT learns to recognize and selectively disrupt key reasoning steps, effectively mounting a self-reflective cognitive attack within the target model. Our approach introduces a lightweight yet effective multi-stage injection pipeline, which selectively rewires attention pathways and perturbs intermediate representations with minimal parameter overhead (only 0.15% updated). ShadowCoT further leverages reinforcement learning and reasoning chain pollution (RCP) to autonomously synthesize stealthy adversarial CoTs that remain undetectable to advanced defenses. Extensive experiments across diverse reasoning benchmarks and LLMs show that ShadowCoT consistently achieves a state-of-the-art average Attack Success Rate of 91.2% (peaking at 94.4%) and a Hijacking Success Rate of 84.9% while preserving benign performance. These results reveal an emergent class of cognition-level threats and highlight the urgent need for defenses beyond shallow surface-level consistency.

Abstract:
Text-Based Person Retrieval (TBPR) aims to retrieve pedestrian images based on natural language description. A key challenge lies in learning generalized and discriminative cross-modal representations for accurate alignment between modalities. Unlike previous methods that promote global fine-grained representations through auxiliary tasks, we focus on learning many-to-many relationships between patch features to enable precise cross-modal correspondence. To this end, we propose a novel ALIGNER method to enhance both generalization and fine-grained cross-modal alignment. ALIGNER learns fine-grained cross-modal correspondences through three complementary components: 1) Discriminative Correspondence Learning captures fine-grained semantic relations by dynamically establishing discriminative token-level alignments between visual and textual modalities, enabling precise local matching; 2) Cross-modal Consistency Learning facilitates global alignment by modeling semantic consistency across modalities through an attention-driven interaction mechanism, effectively bridging high-level feature representations; 3) Granular Feature Modeling enhances the robustness of learned representations by injecting local structural awareness during training, allowing the model to preserve fine-grained details across modalities better. Together, these components promote both generalization and fine-grained alignment. Extensive experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate that our method achieves state-of-the-art performance in both standard and domain generalization settings. Our code is available at https://github.com/cceinhorn/ALIGNER

Abstract:
Model Inversion Attacks (MIAs) can recover private training data by accessing model weights or outputs, posing significant threats to user privacy. Existing defenses cannot provide comprehensive protection against attackers with varying levels of knowledge and often lack evaluation against the most advanced attacks. Moreover, current defenses primarily focus on regularizing latent representations or labels. While prior work has explored input-level defenses (e.g., Random Erasing), these approaches are typically limited to simple transformations, and more complex or systematically combined input-level defenses remain underexplored. As the most common input-level perturbation technique, data augmentation applies transformations like cropping directly to input images. However, finding augmentations or their combinations that achieve a good privacy-utility trade-off is challenging, as it is impossible to evaluate every augmentation exhaustively through attacks. To address this, we design privacy and utility assessments for efficient evaluation and propose a Defense via Auto-Augmentation Search (DAAS). DAAS can automatically assess and identify candidates with strong privacy-utility trade-offs from a large augmentation pool. The final search results can then be leveraged for privacy-preserving training against MIAs. We evaluate DAAS across various models, datasets, and attacks, demonstrating superior defense performance compared to existing methods. Extensive ablation studies further demonstrate the effectiveness of DAAS.

Abstract:
SRAM Physically Unclonable Functions (PUFs) derive secret keys from start-up values for inherent security benefits but suffer from reliability issues due to bit flipping. We introduce the Threshold-based Majority Voting Scheme (TMVS), a lightweight method that eliminates noise and mitigates bias in SRAM PUFs while retaining the simplicity of majority voting decoders used by repetition codes, without the significant entropy loss that repetition codes incur under biased responses. TMVS runs entirely in software, requires no cell-level bit-error rate qualification or SRAM redesign, and avoids the complex decoders of heavy error correcting codes. We derive closed-form expressions for decoding-error probability and expected memory, validate them on experimental data, and present a security analysis that provides exact formulas for min-entropy and secrecy leakage due to helper data and bias, identifying conditions under which TMVS achieves zero secrecy leakage. On a large public dataset, TMVS shows near-zero cross-chip secrecy leakage and preserves average conditional min-entropy above 1 bit despite biased, spatially correlated SRAM statistics. Compared with prior work, TMVS offers the smallest decoding complexity at the cost of a larger PUF size. In a representative configuration, TMVS generates a 128-bit key with failure probability 9.15 × 10^-7 and zero secrecy leakage at a bit-flip probability of 10%, requiring only approximately 248k clock cycles on a 32-bit ARM Cortex-M0. These results show that TMVS is practical and implementation-friendly for resource-constrained, low-power devices.
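
As an illustration of what majority voting with a threshold can look like in software, the sketch below keeps a cell only when enough repeated power-up readouts agree on its value. This is a generic reading of the idea and an assumption on our part, not the TMVS algorithm as specified in the paper; the function name, the selection rule, and the sample data are hypothetical.

```python
def threshold_majority_vote(readouts, threshold):
    """Vote per cell across repeated readouts, keeping only strongly agreeing cells.

    readouts:  list of equal-length bit lists, one list per power-up.
    threshold: minimum number of agreeing readouts needed to keep a cell.
    Returns (selected_cell_indices, voted_bits).
    """
    n = len(readouts)
    if not n / 2 < threshold <= n:
        raise ValueError("threshold must be a strict majority and at most n")
    selected, bits = [], []
    for cell in range(len(readouts[0])):
        ones = sum(r[cell] for r in readouts)
        if ones >= threshold:            # reliably 1
            selected.append(cell)
            bits.append(1)
        elif n - ones >= threshold:      # reliably 0
            selected.append(cell)
            bits.append(0)
        # otherwise the cell is too noisy and is skipped
    return selected, bits


# Five hypothetical power-up readouts of a four-cell SRAM region
readouts = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]
cells, key_bits = threshold_majority_vote(readouts, threshold=4)
print(cells, key_bits)
```

Raising the threshold discards more cells, which is consistent with the trade-off stated above: simpler decoding in exchange for a larger PUF.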

Abstract:
Most skeleton-based unsupervised action recognition methods rely on centralized learning, which poses privacy risks when applied to human-related data. Federated learning (FL) is widely recognized for preserving data privacy but still struggles with data heterogeneity. Data condensation, as a representative approach, effectively mitigates heterogeneity while safeguarding client privacy. Existing data condensation methods in FL mainly focus on clients, while overlooking two critical issues that arise during global model training on the server. First, the limited data availability elevates the risk of overfitting in the global model. Second, information loss induced by data condensation results in performance degradation. To overcome these limitations, we propose Fed-C&E, featuring a closed-loop condensation-expansion paradigm where client-side condensation is followed by server-side expansion via a dual-level expansion mechanism. During the expansion stage, we present a novel prototype-to-sequence similarity transformation matrix pool to synthesize more samples that align with client data distributions. Furthermore, to mitigate the inherent semantic sparsity of skeletal data and condensation-induced information loss, we devise a feature expansion strategy that leverages second-order client statistics to supplement global information, where the expanded features serve as novel supervisory signals for contrastive learning. Extensive experiments across multiple datasets demonstrate that Fed-C&E not only outperforms aggregate-then-adapt FL methods but also effectively preserves data privacy.

Abstract:
With the advent of blockchain, on-chain crimes such as phishing, fraud, and cryptocurrency heists can be severe. Stealthy on-chain crimes tend to adopt Advanced Persistent Threat (APT) tactics to avoid detection, characterized by long-term persistence and ever-evolving crime patterns. However, existing forensic approaches struggle with scalability as the time span of on-chain crimes grows. To address this limitation, we propose BlockAthena, the first scalable forensic framework for long-term on-chain crime analysis in account-based blockchains. We first observe that real-world on-chain crimes tend to exhibit botnet-style behaviors and APT-like life-cycles from a long-term perspective, characterized as co-occurrence transactional behaviors and latent periodicity (e.g., crime preparation, exploitation, and propagation stages). Inspired by these insights, BlockAthena segments long-term transaction topology into semantically complete subgraphs based on crime periodicity (i.e., evolution periods) and models both direct and co-occurrence transactional behaviors within segment subgraphs, enhancing memory efficiency while preserving key behavioral traits. Specifically, BlockAthena consists of three key components: 1) a Motif-aware Periodicity Modeling (MPM) module that performs joint analysis in the wavelet-topology domain to extract crime evolution periods; 2) a mixed-order behavior profiler that captures fine-grained temporal dynamics and botnet-style co-occurrence transactions via dynamic graph mining and hypergraph modeling; and 3) an Evolution-aware Residual Aggregator (ERA) that synthesizes long-term patterns across evolution periods using residual connections. Extensive experiments validate the effectiveness and scalability of BlockAthena, achieving an average 18% improvement in F1-score and up to an 80% reduction in memory overhead compared to the best-performing baseline. The real-world case study further demonstrates its capability to uncover APT-style stealthy tactics in long-term on-chain crimes.

Abstract:
Website fingerprinting (WF) attacks identify Tor-encrypted websites but struggle with cross-domain scenarios due to traffic distribution shifts. The existing few-shot WF attacks address the cross-domain problem with excessive auxiliary data, significantly reducing deployment efficiency. This work proposes UDA-WF, a data-efficient few-shot WF with Unsupervised Domain Adaptation (UDA). UDA-WF first pre-trains the feature extractor with limited auxiliary data in the source website domain. Then, it extracts the invariant feature space by computing the intersection of the source and target feature spaces through the unsupervised domain adaptation with the softmatch mechanism. Finally, UDA-WF fine-tunes the feature extractor and a single-layer perceptron to extract the discriminative unique feature space of the target website domain. We evaluate UDA-WF on our WF dataset collected over multiple months. UDA-WF significantly overcomes the cross-domain problem while reducing auxiliary data requirements by 95% and pre-training bootstrap time by 99% compared to the State-Of-The-Art (SOTA) methods. UDA-WF achieves an accuracy of 97.37% under the 20-shot setting in the closed-world scenario and outperforms SOTA methods. To further demonstrate the model’s adaptability to diverse real-world requirements, we validate it on the DF and Wang datasets, achieving accuracies exceeding 92% and 94%, respectively. Moreover, the results show that our UDA-WF is more resilient to concept drift and robust to WF defense.

Abstract:
Neural network-based steganography has garnered considerable attention for its strong security. However, existing approaches often suffer from excessive coupling between the embedding and extraction networks: the sender and receiver must employ paired models and maintain strict synchronization. Such synchronization not only complicates deployment but also introduces more severe potential risks of information leakage. To overcome this limitation, we propose a synchronization-free steganographic framework based on decoupled neural embedding networks, following the destruction-restoration principle. In our design, message embedding is realized through a destruction operation, while recovery is achieved using a neural network from the audio restoration domain. This decoupled architecture allows the sender to upgrade, replace, or randomize the embedding network—thus enabling dynamic model changes—without impairing the receiver’s ability to correctly extract the hidden message. As a result, synchronization-related vulnerabilities are fundamentally eliminated. Experimental results demonstrate that even under dynamic changes in the embedding network, the hidden information can still be reliably extracted, confirming both the effectiveness and enhanced security of the proposed approach.

Abstract:
With the increasing deployment of resource-constrained devices in daily life, ultra-lightweight ciphers become a necessity to tackle the security and privacy concerns in resource-constrained devices. In 2023, Gül and Kara studied the question of how to design a secure ultra-lightweight stream cipher with a small internal state, and introduced a new small-state stream cipher called DIZY. The cipher utilizes Truncated Pseudorandom Permutations (TPP) and has a provable security in the indistinguishability model. It consists of two versions, called DIZY-128 with a 128-bit key and DIZY-80 with an 80-bit key, respectively. In this paper, effective key recovery attacks on DIZY-80 and DIZY-128 are proposed. Both attacks leverage the weakness of DIZY that the attacker can easily reach a weak state in the middle of the initialization using chosen IVs. Based on constructing Hellman tables, the key recovery attacks on DIZY-80 and DIZY-128 are further improved. The cryptanalytic results show that DIZY-80/DIZY-128 can only provide a 65/86-bit security level against the key recovery attack, while it is claimed to provide an 80/112-bit security level by the designers. Finally, an improved variant of DIZY, called DIZYa, is proposed. The analysis on DIZYa shows that the improved variant can provide better security resistance against all known attacks including our attacks on DIZY, while maintaining the commendable characteristics of DIZY. This makes DIZYa a more suitable small-state stream cipher choice for resource-constrained devices like RFID tags.

Abstract:
Federated learning is a privacy-preserving distributed learning paradigm in which a server coordinates multiple clients to train a global model. However, current federated optimization introduces bias by favoring the interests of specific clients, overlooking the concerns of vulnerable participants to maximize global benefits. Many efforts have been made in the pursuit of fairness for this shortcoming, yet we notice that such endeavors exhibit extremely poor robustness. A minimal amount of malicious tampering is sufficient to disrupt convergence. In fact, we recognize a subtle trade-off between robustness and fairness, which remains an open question. To address these concerns, we propose FairRoP, a systematic strategy that enhances fairness and guarantees robustness with adaptive client selection. We model complex multi-objective optimization problems using a simple and efficient $\epsilon$-greedy Thompson Sampling Multi-Armed Bandit (TS-MAB). At the core of this approach are three submodules: fairness awareness, attack detection, and q-Balance, each designed to tackle specific sub-problems within the broader optimization challenge. Our experimental results, conducted on real datasets, showcase that FairRoP significantly improves overall fairness and robustness compared to state-of-the-art solutions. Furthermore, our approach seamlessly integrates with other aggregation algorithms.
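To make the bandit formulation concrete, the sketch below shows a generic Beta-Bernoulli Thompson Sampling selector with an epsilon-greedy exploration step. It is only an illustration of that general technique: the class name, the Beta(1, 1) prior, and the reward in [0, 1] are assumptions, and the fairness-awareness, attack-detection, and q-Balance submodules of FairRoP are not modeled.

```python
import random


class EpsilonGreedyThompson:
    """Beta-Bernoulli Thompson Sampling with an epsilon-greedy exploration step."""

    def __init__(self, num_clients: int, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per client: one pseudo-success and one pseudo-failure.
        self.successes = [1.0] * num_clients
        self.failures = [1.0] * num_clients

    def select(self) -> int:
        """Pick one client index for the next round."""
        n = len(self.successes)
        if self.rng.random() < self.epsilon:
            # Explore: choose a client uniformly at random.
            return self.rng.randrange(n)
        # Exploit: draw one sample per posterior and take the largest.
        samples = [
            self.rng.betavariate(self.successes[i], self.failures[i])
            for i in range(n)
        ]
        return max(range(n), key=samples.__getitem__)

    def update(self, client: int, reward: float) -> None:
        """Fold a reward in [0, 1] (a hypothetical utility score) into the posterior."""
        self.successes[client] += reward
        self.failures[client] += 1.0 - reward
```

In a federated setting, the reward passed to `update` would have to encode whatever the server wants to favor, for instance a combination of fairness gain and a penalty for suspected tampering; how FairRoP defines it is not stated in the abstract.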

Abstract:
Blockchain-based electronic voting systems can achieve voter identity anonymity via cryptographic techniques such as ring signatures and blind signatures. However, fully hiding voter information undermines traceability. Previous mechanisms provide limited traceability, typically by preventing double-voting attacks while compromising anonymity. From a cryptographic standpoint, threshold signatures with private accountability seem to offer a balanced solution between privacy and accountability. However, applying them directly to blockchain-based electronic voting systems requires fixing the set of voters in advance, and each verification must pre-store all voters’ public keys, incurring at least linear-size storage overhead and poor scalability. Optimizing storage and scalability while guaranteeing both anonymity and traceability remains challenging. In this paper, we propose BAVote, an efficient and scalable blockchain-based electronic voting system with anonymity and traceability. It is built on top of a new threshold signature scheme named ConsATS that features a constant-size verification key. ConsATS compresses all voters’ public keys into a single verification key, allowing an aggregated ballot to be verified without storing or processing per-voter public keys, thereby reducing on-chain storage overhead and improving scalability in BAVote. The aggregated signature serving as the ballot is encrypted in ConsATS, while BAVote further combines one-time addresses and a commit–reveal mechanism to protect intermediate on-chain data during voting. The corresponding tracing key enables an authorized tracer to identify malicious voters during authorized audits. We implement a prototype of BAVote in both a local blockchain environment and the Ethereum Sepolia testnet. The experimental results show that the storage cost of verification keys in our system is reduced by more than 90% and the verification time is 7x faster compared to existing schemes.
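The commit–reveal mechanism mentioned above can be illustrated with a hash-based commitment, which is one common instantiation. The abstract does not say which commitment BAVote uses, so the SHA-256 construction, the 32-byte nonce, and the function names below are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets


def commit(ballot: bytes) -> tuple[bytes, bytes]:
    """Return (commitment, nonce); only the commitment is published in the commit phase."""
    # A fixed-length random nonce hides low-entropy ballots and keeps the
    # nonce/ballot boundary unambiguous inside the hash input.
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + ballot).digest(), nonce


def verify_reveal(commitment: bytes, ballot: bytes, nonce: bytes) -> bool:
    """Check a revealed (ballot, nonce) pair against the earlier commitment."""
    expected = hashlib.sha256(nonce + ballot).digest()
    return hmac.compare_digest(commitment, expected)
```

During the reveal phase the voter discloses `ballot` and `nonce`, and anyone holding the earlier commitment can call `verify_reveal` to confirm that the ballot was fixed before other ballots became visible.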

Abstract:
Anonymous credential systems (ACs) enable users to selectively disclose attributes in an anonymous and unlinkable manner, but their misuse is hard to address without traceability. Existing traceable ACs introduce regulators to handle this issue, but they impose substantial communication and computational overhead on regulators, especially in complex scenarios involving large numbers of authentication records and multiple independent regulators. In this work, we propose Public-Key Encryption with Equality Test and Variable Public Generator (PKEET-VPG) and its verifiable version (VPV) for adapting to traceable ACs to support secure outsourceable record retrieval from regulators to service providers. The core technical idea is a session-user-level tracing key whose validity is bound to a single authentication session, thereby preventing the abuse of tracing keys. We formally prove that both schemes achieve OW-CCA2 security, and show that, when integrated into ACs, they preserve user anonymity, support non-frameability, and retain cross-session unlinkability. Theoretical analysis and comparative experiments demonstrate that our schemes can reduce the overhead of regulators at a moderate cost in the authentication phase. Furthermore, our schemes support session-level ciphertext deduplication, which may be of independent interest in some scenarios, such as anonymous voting or one-action-per-session authentication.

Abstract:
Model fingerprinting has emerged as a crucial mechanism for safeguarding the intellectual property of open-source models, offering a non-intrusive approach that requires no modifications to the protected model. However, our analysis reveals that existing fingerprinting techniques are fundamentally vulnerable to false claim attacks, wherein adversaries can fraudulently assert ownership over independent third-party models. We demonstrate that this vulnerability stems from the untargeted nature of current methods, which evaluate model similarity based on arbitrary sample outputs rather than alignment with a specific, predefined reference. To mitigate this vulnerability, we introduce FIT-Print, a targeted fingerprinting paradigm that actively counters false claim attacks. Specifically, FIT-Print leverages optimization to transform the fingerprint into a verifiable, targeted signature. Building upon this foundation, we propose two black-box fingerprinting methods, the bit-wise FIT-ModelDiff and the list-wise FIT-LIME, which utilize output distances and feature attributions as robust model signatures, respectively. Extensive evaluations across benchmark models and datasets show that our framework perfectly neutralizes false claim attacks (100% defense success rate) and eliminates false alarms on independent models (0.0%), all while maintaining a 100% ownership verification rate against diverse model reuse techniques.

Abstract:
Recent advancements in artificial intelligence hold ample potential for monitoring applications using surveillance cameras. However, concerns about privacy and model bias have made it challenging to utilize them in public. Although de-identification approaches have been proposed in the literature, aiming to achieve a certain level of anonymization (AN), most of them employ deep learning models that are computationally demanding for real-time edge deployment. This study revisits conventional AN solutions for privacy protection and real-time video anomaly detection (VAD) applications. We propose a lightweight adaptive AN for VAD (LA3D) that employs dynamic adjustment to enhance full-body privacy protection. We have evaluated privacy protection and VAD utility retention efficacy using several publicly available datasets to examine the strengths and weaknesses of different AN methods and highlight the promising leverage of our approach. Our experiment demonstrates that the LA3D enables substantial improvement in privacy AN without severely degrading VAD efficacy, outperforming conventional and deep learning approaches. Code: https://github.com/muleina/LA3D

Abstract:
Vertical federated learning (VFL) has made significant strides in enhancing data privacy and security for cross-silo applications. However, despite its benefits, VFL remains vulnerable to emerging security threats, particularly backdoor attacks. While most existing research on VFL backdoor attacks has focused on image and natural language processing tasks, the security of tabular data, commonly used in high-risk domains such as finance and healthcare, has been largely overlooked. In this paper, we introduce chamaeleon, a novel backdoor attack targeting VFL for tabular data. Our approach achieves two key advancements. First, to address the challenge of restricted label access in VFL, chamaeleon employs a two-step inference method to extract label information. This method combines a label classifier with a top-$k$ confidence filtering mechanism, enabling the precise identification of target-label samples (i.e., backdoored samples) with a precision of approximately 99.85%. Second, to overcome the limitations of fixed trigger patterns, which can disrupt the semantic integrity of tabular data (e.g., altering “male” to “pregnant”), chamaeleon introduces a dynamic trigger design. Each backdoored sample is injected with a unique trigger, generated by a transformer-based model inspired by large language models, ensuring semantic consistency. Additionally, a one-on-two adversarial game is implemented to optimize the generator’s performance with limited training data. Extensive evaluations across six models and six datasets demonstrate the effectiveness of our proposed attack. We also examine various factors that could influence the attack success and systematically analyze potential defense mechanisms to mitigate this newly identified threat.

Abstract:
Existing adversarial attacks for face anti-spoofing predominantly assume that the model parameters are fully known, and often overlook the transferability of adversarial examples across different models and domains. Furthermore, they typically target relatively homogeneous architectures, relying primarily on basic deep learning models for anti-spoofing. To address these limitations, this paper proposes an illumination-based input transformation method for generating adversarial attacks. A liveness ablation module is introduced to suppress liveness-related cues in the input image prior to attack generation, thereby enhancing the adversarial strength of the crafted examples. Additionally, a random illumination transformation strategy is employed to increase domain divergence by altering illumination factors, which enriches the diversity of input samples and boosts the transferability of adversarial examples across different models and settings. Extensive experiments conducted on two public datasets demonstrate that the proposed method outperforms existing approaches in terms of physical-world transferability. Moreover, the liveness ablation module can be integrated with other attack strategies to further improve their adversarial effectiveness.

Abstract:
Web3 applications have grown rapidly in recent years, especially on the Ethereum platform, and are becoming a target of scammers. Web3 scams imitate the services provided by legitimate platforms and mimic regular activity to deceive users. However, previous studies have primarily concentrated on de-anonymization and phishing nodes, neglecting the distinctive features of web3 scams. Moreover, current phishing account detection tools utilize graph learning or sampling algorithms to obtain graph features, yet large-scale transaction networks with temporal attributes conform to a power-law distribution, posing challenges in detecting web3 scams. To overcome these challenges, we present ScamSweeper, a novel framework that emphasizes the dynamic evolution of transaction graphs, to identify web3 scams on Ethereum. ScamSweeper samples the network with a structure temporal random walk, which is an optimized sample walking method that considers both temporal attributes and structural information. Then, the directed graph encoder generates the features of each subgraph during different temporal intervals, ordered as a sequence. Moreover, a variational Transformer is utilized to extract the dynamic evolution in the subgraph sequence. Furthermore, we collect a large-scale transaction dataset consisting of web3 scams, phishing, and normal accounts, which are from the first 18 million block heights on Ethereum. Subsequently, we comprehensively analyze the distinctions in various attributes, including nodes, edges, and degree distribution. Our experiments indicate that ScamSweeper outperforms SIEGE, Ethident, and PDTGA in detecting web3 scams, achieving a weighted F1-score improvement of at least 17.29% with the base value of 0.59. In addition, ScamSweeper in phishing node detection achieves at least a 17.5% improvement over DGTSG and BERT4ETH in F1-score from 0.80.

Abstract:
In recent years, significant advances in gait recognition have been seen, with many methods reporting high accuracy on certain datasets. However, domain shifts, such as distribution inconsistencies in viewpoint or clothing, can severely degrade the performance of these models on unseen target domains, hindering the widespread application of gait recognition. Some unsupervised domain adaptation (UDA) methods have been proposed to address this problem. However, these approaches require continual updates with target domain data, which is often difficult to obtain due to privacy concerns and deployment complexity. This paper presents GaitDG, a single-source domain generalization framework designed to enhance the generalization ability of gait recognition models for unseen target domains, requiring training on only one source domain without accessing target domain data. During training, GaitDG employs adversarial training to disentangle domain-specific and identity-specific features, enabling the discovery of latent sub-domains and the extraction of domain-invariant features. Furthermore, GaitDG supports the integration of data augmentation to diversify the source domain data. We also introduce a data augmentation method, Segmentation Model Transfer (SMT), to mitigate recognition performance degradation caused by variations in segmentation models. As a model-agnostic approach, GaitDG can directly enhance the cross-domain recognition performance of gait recognition models without altering their structure. Comprehensive experiments on widely used gait datasets demonstrate that GaitDG significantly improves the cross-domain recognition performance of several state-of-the-art gait recognition models.

Abstract:
Sharing relational databases is essential in today’s data-driven world for fostering collaboration, enhancing efficiency, and enabling real-time data access. However, privacy and copyright issues arise when sharing privacy-sensitive or valuable data. Additionally, high utility is required in shared data to enable accurate data mining and analysis. Entry-level differentially private fingerprinting schemes (DPFS) could address these concerns. In a DPFS, data can be securely shared without leaking original values while still supporting accurate analysis. Moreover, detectable fingerprints can deter unauthorized redistribution. However, existing DPFSs often lack utility (due to format changes and entry-wise bias) or robustness, as fingerprints can be removed undetected. In this paper, we propose an unbiased and robust differential privacy-based fingerprinting scheme (DPFS), which ensures that the fingerprinted copy remains an unbiased estimate of the original data. By incorporating differential privacy noise, our scheme effectively mitigates alteration, collusion, and hybrid attacks. Our DPFS satisfies $\epsilon$-entry-level differential privacy, enabling clients to conduct unbiased analysis. To improve robustness, we design group-based fingerprint detection, which estimates the mean of injected noise per group with error tolerance. We provide a theoretical robustness analysis and propose a method for achieving optimal robustness. Experiments on four real-world databases show that our scheme consistently detects fingerprints and improves accuracy by up to 20% on machine learning tasks compared to existing DPFSs.

Abstract:
Skeleton action recognition models have secured more attention than video-based ones in various applications due to privacy preservation and lower storage requirements. Skeleton data are typically transmitted to cloud servers for action recognition, with results returned to clients via Apps/APIs. However, the vulnerability of skeletal models against adversarial perturbations gradually reveals the unreliability of these systems. Existing black-box attacks all operate in a decision-based manner, resulting in numerous queries that hinder efficiency and feasibility in real-world applications. Moreover, all off-the-shelf attacks focus only on restricted perturbations, while ignoring model weaknesses when faced with non-semantic perturbations. In this paper, we propose two query-effIcient Skeletal Adversarial AttaCks, ISAAC-K and ISAAC-N. As a black-box attack, ISAAC-K utilizes Grad-CAM in a surrogate model to extract key joints where minor sparse perturbations are then added to fool the classifier. To guarantee natural adversarial motions, we introduce constraints of both bone length and temporal consistency. ISAAC-K finds stronger adversarial examples on the $\ell_\infty$ norm, which can encompass those on other norms. Exhaustive experiments substantiate that ISAAC-K can uplift the attack efficiency of the perturbations under 10 skeletal models. Additionally, as a byproduct, ISAAC-N fools the classifier by replacing skeletons unrelated to the action. We surprisingly find that skeletal models are vulnerable to large perturbations where the part-wise non-semantic joints are just replaced, leading to a query-free no-box attack without any prior knowledge. Based on that, four adaptive defenses are proposed to improve the robustness of skeleton recognition models.

Abstract:
The Link Flooding Attack (LFA) poses a significant threat to the network as a novel form of indirect, distributed denial-of-service attack. There is an urgent need for a defense capable of eliminating LFA. LFA operates as a closed-loop, continuously monitoring target links and adjusting traffic to adapt to network changes, thereby sustaining the attack. This implies that the effective defense against LFA should be feedback-control-based, enabling adaptive responses to evolving attack strategies. However, existing works for eliminating LFA operate using an open-loop way, which lacks feedback throughout the defense process, preventing them from efficiently detecting and blocking attack flows. In this paper, we adopt the concept of feedback control to propose Loop-Filter, which first filters part of the traffic on the attacked link and then refines the detected attack flows based on feedback from the link and flows to optimize traffic filtering. This process is repeated iteratively until the attacked link returns to normal. To implement the concept, we propose a blockchain-based collaboration defense architecture, along with economic incentives, to promote cooperation between domains. Meanwhile, we leverage reinforcement learning to learn how to rapidly and accurately adjust the detection results of flows based on the feedback. We also design a lossless compression method of filtering rules and dynamically deploy them to drop attack traffic maximally. Finally, we validate the effectiveness and efficiency of Loop-Filter under various traffic conditions.

Abstract:
Network traffic classification is crucial for both network security and management. Despite advances in deep learning-based multi-task traffic classification, existing models often struggle to jointly handle multiple tasks while providing interpretable insights. In multi-task scenarios, different tasks rely on distinct regions of the traffic sequence, motivating the use of dynamic and interpretable attention mechanisms. To this end, we propose Dynamic Quantized Self-Attention (DQSA), a unified framework specifically designed for multi-task network traffic classification. At its core, the Task Gated Attention Router (TGAR) dynamically associates attention heads with different tasks, enabling adaptive focus on task-specific patterns. This mechanism provides interpretable attention scores, which help analyze misclassifications and guide further model refinement. To improve efficiency and handle diverse network traffic features, we introduce the Soft Quantized Self-Attention Head (SQ-SAH) to reduce computational complexity and extend the Rotary Position Embedding (RoPE) to accommodate these features. Extensive experiments on ISCX VPN-nonVPN and DCI-LTE datasets demonstrate that DQSA consistently outperforms state-of-the-art baselines, achieving 92.85% accuracy on the encapsulation-level task of ISCX VPN-nonVPN and 93.17% accuracy on the application-level task of DCI-LTE, surpassing the strongest existing methods by up to 2.65%, while providing interpretable task-specific attention for efficient multi-task network traffic classification.

Abstract:
In recent years, Cross-Modal Retrieval (CMR), which can retrieve data across types based on query semantics, has become an attractive technology due to the widespread applications of multimedia data. Outsourcing multimedia data to a cloud server is a reliable way to improve the quality of CMR services, but it will also incur potential data privacy leakage issues. Existing schemes for privacy-preserving outsourced data search services are either inapplicable to CMR or limited by efficiency and scalability. To address the above issues, we investigate the problem of Searchable Symmetric Encryption (SSE) for CMR in this paper. Firstly, we formulate the definition of SSE for CMR (namely, $\textsf{SSE}_{\textsf{CMR}}$) and extend the SSE leakage functions to capture the leakage in $\textsf{SSE}_{\textsf{CMR}}$. Then, by constructing distance-computation-free Hamming inverted multi-index, we propose a practical $\textsf{SSE}_{\textsf{CMR}}$ construction. Specifically, we transform Hamming distance-based range queries into multi-key queries, thereby avoiding computationally expensive comparison operations and enabling efficient Hamming distance queries over encrypted data. Our design supports secure CMR with sub-linear complexity in a single communication roundtrip. Through rigorous security analysis, we demonstrate that our construction can provide adaptive security. Empirical evaluations on real-world datasets demonstrate that $\textsf{SSE}_{\textsf{CMR}}$ outperforms the state-of-the-art scheme in both efficiency and accuracy, and is comparable to plaintext applications. Our code is available at https://github.com/XYWANGXDU/SSE-CMR

Abstract:
Cyber Threat Intelligence (CTI) and serverless computing are two emerging technologies that have significantly impacted their respective domains in recent years. However, their interaction remains surprisingly underexplored. In this work, through in-depth semi-structured interviews with cybersecurity experts, we identify the trust issues within the CTI ecosystem that can be exploited to introduce fake CTI manipulation, enabling indirect attacks against entities with dynamic IP allocation, such as those in serverless computing. Furthermore, these attacks can be amplified by commercial CTI platforms due to their widespread adoption and sharing mechanisms. Based on these insights, we propose Ares, a novel attack strategy that leverages fake CTI manipulation to enable large-scale, stealthy indirect denial-of-service attacks against serverless infrastructures. We demonstrate the feasibility and impact of Ares through extensive evaluations in a controlled experimental environment. Our results show that Ares can rapidly and widely disseminate fake CTI within the CTI ecosystem, leading to an overall average reject rate of 23.03% and a high reject rate of up to 45.42% when accessing top websites in certain industries, while maintaining a low detection rate across state-of-the-art serverless security systems. These findings underscore the urgent need for more frequent communication and collaboration among CTI platforms and related stakeholders to develop a more robust trustworthiness model across the ecosystem.

Abstract:
With rich temporal-spatial information, video-based person re-identification methods have shown broad prospects. Although tracklets can be easily obtained with ready-made tracking models, annotating identities is still expensive and impractical. Therefore, some video-based methods propose using only a few identity annotations or camera labels to facilitate feature learning. They also simply average the frame features of each tracklet, overlooking unexpected variations and inherent identity consistency within tracklets. In this paper, we propose the Self-Supervised Refined Clustering (SSR-C) framework without relying on any annotation or auxiliary information to promote unsupervised video person re-identification. Specifically, we first propose the Noise-Filtered Tracklet Partition (NFTP) module to reduce the feature bias of tracklets caused by noisy tracking results, and sequentially partition the noise-filtered tracklets into “sub-tracklets”. Then, we cluster and further merge sub-tracklets using the self-supervised signal from the tracklet partition, which is enhanced through a progressive strategy to generate reliable pseudo labels, facilitating intra-class cross-tracklet aggregation. Moreover, we propose the Class Smoothing Classification (CSC) loss to efficiently promote model learning. Extensive experiments on the MARS and DukeMTMC-VideoReID datasets demonstrate that our proposed SSR-C for unsupervised video person re-identification achieves state-of-the-art results and is comparable to advanced supervised methods. The code is available at https://github.com/Darylmeng/SSRC-Reid

Affiliations: Key Laboratory of Data and Intelligent System Security (DISSec), Ministry of Education, College of Cryptology and Cyber Science, and the Academy for Advanced Interdisciplinary Studies (AAIS), Nankai University, Tianjin, China; Key Laboratory of Data and Intelligent System Security (DISSec), Ministry of Education, College of Cryptology and Cyber Science, Nankai University, Tianjin, China; Tianjin Key Laboratory of Advanced Networking (TANKLab), College of Intelligence and Computing (CIC), Tianjin University, Tianjin, China

Abstract:
Website Fingerprinting (WF) attacks can be mitigated through random camouflage or pair camouflage. Random camouflage inserts random dummy packets into the traces according to pre-defined rules. It can be compromised easily by machine learning-based WF attacks. Pair camouflage obfuscates the distinguishing features of paired websites by inserting elaborated perturbations into raw traces, thereby misleading the attacker. It is costly because it must maintain a perturbation generator for each pair of websites. Based on these insights, we propose Stinger, a novel data poisoning-based WF defense, which enables effective defense against WF attacks with low bandwidth overhead and maintains only one generator for all websites. Stinger exploits the idea of poisoning by contaminating the model directly in such a way that the WF attacks only classify based on the inserted poison sequences, thus being low-overhead and website-independent. We experimentally evaluate Stinger using the DF and AWF datasets. The results show that Stinger improves the defense success rate by an average of 20.37% and 22.83% while reducing overhead by 85.88% and 81.35%, respectively.

Abstract:
This paper addresses the security and privacy issues associated with the global models in Federated Learning by proposing a new approach, called PDFL, which tackles the challenges of poisoning attacks and privacy leakage during FL training rounds. PDFL is based on secure multi-party computation and performs privacy-preserving cluster analysis on encrypted data from participants in order to identify malicious poisoning attackers. This approach involves a two-server mechanism and integrates four privacy-preserving protocols based on two-party computation (2PC): SecJudge for normalizing gradients, SecCosine for computing the cosine similarity values among gradients, SecClu for countering poisoning attacks, and SecAgg for secure aggregation by the server. These protocols are designed to achieve low computational costs, preserve client data privacy, and mitigate poisoning attacks from the potentially malicious clients. We provide a theoretical proof that our four sub-protocols and the PDFL scheme are both safe and reliable, demonstrating that PDFL can ensure the privacy and security of the participating data. Additionally, we conduct extensive simulation experiments to evaluate the accuracy, efficiency, computational overhead, and communication overhead associated with the PDFL scheme. Experimental results show the potential of the PDFL scheme in significantly enhancing the ability to identify malicious poisoning attackers in federated learning systems accurately and efficiently, hence making PDFL a promising solution for addressing privacy and security concerns in this domain.
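
As a plaintext illustration of the quantities that SecJudge and SecCosine are described as handling, the following minimal sketch normalizes gradients and computes their pairwise cosine similarities. It is not the 2PC protocol: PDFL operates on encrypted data across two servers, whereas this sketch works in the clear, and the function names `normalize` and `cosine_matrix` are hypothetical.

```python
import math

def normalize(grad):
    """Scale a gradient vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in grad))
    return [v / norm for v in grad] if norm else list(grad)

def cosine_matrix(grads):
    """Pairwise cosine similarity between client gradients (plaintext analogue only)."""
    unit = [normalize(g) for g in grads]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]

# Two clients pointing the same way and one pointing the opposite way.
sims = cosine_matrix([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]])
```

A clustering step over such similarities could then separate outlying updates; how SecClu does this is not specified in the abstract and is not sketched here.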

Abstract:
Secure three-party computation (3PC) with semi-honest security under an honest majority offers notable efficiency in computation and communication; for Boolean circuits, each party sends a single bit for every AND gate, and nothing for XOR. However, round complexity remains a significant challenge, especially in high-latency networks. Some works can support multi-input AND and thereby reduce online round complexity, but they require exponential communication for generating the correlations in either the preprocessing or the online phase. How to extend the AND gate to multiple inputs while maintaining high correlation generation efficiency remains unsolved. To address this problem, we propose Alkaid, a round-efficient 3PC framework for Boolean circuits built on an improved multi-input AND gate. By mixing correlations and redundancy, we propose a concretely efficient correlation generation approach for small numbers of input bits N < 4 and shift the correlation generation to the preprocessing phase. Building on this, we create a round-efficient AND protocol for general cases with N > 4. Exploiting the improved multi-input AND gates, we design fast depth-optimized parallel prefix adder and share conversion primitives in 3PC, achieved with new techniques and optimizations for better concrete efficiency. We further apply these optimized primitives to enhance the efficiency of secure non-linear functions in machine learning. We implement Alkaid and extensively evaluate its performance. Compared to state-of-the-art frameworks such as ABY3 (CCS’2018), Trifecta (PoPETs’2023), and Meteor (WWW’2023), Alkaid enjoys 1.5× – 2.5× efficiency improvements for Boolean primitives and non-linear functions, with better or comparable communication.
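
The baseline cost mentioned above (one bit per party per AND gate, nothing for XOR) can be illustrated with the well-known two-input replicated secret sharing construction. The sketch below simulates all three parties in one process and is not Alkaid's multi-input protocol. The helper names are hypothetical, and the zero-sharing bits are drawn by the simulator, whereas a real deployment would typically derive them from pre-shared keys.

```python
import secrets

def share(x):
    """Split bit x as x0 ^ x1 ^ x2; party i holds the pair (x_i, x_{i+1})."""
    x0, x1 = secrets.randbits(1), secrets.randbits(1)
    parts = [x0, x1, x ^ x0 ^ x1]
    return [(parts[i], parts[(i + 1) % 3]) for i in range(3)]

def reveal(shares):
    """Reconstruct the secret from the first component of each party's pair."""
    return shares[0][0] ^ shares[1][0] ^ shares[2][0]

def xor_gate(xs, ys):
    """XOR is purely local: each party XORs its own pairs, no communication."""
    return [(a0 ^ b0, a1 ^ b1) for (a0, a1), (b0, b1) in zip(xs, ys)]

def and_gate(xs, ys):
    """AND: each party computes one bit locally and sends it to one neighbour."""
    # Zero sharing (simulated here): alpha[0] ^ alpha[1] ^ alpha[2] == 0.
    r = [secrets.randbits(1) for _ in range(3)]
    alpha = [r[i] ^ r[(i + 1) % 3] for i in range(3)]
    z = []
    for i in range(3):
        (x0, x1), (y0, y1) = xs[i], ys[i]
        z.append((x0 & y0) ^ (x0 & y1) ^ (x1 & y0) ^ alpha[i])
    # Party i obtains z[i+1] from party i+1, i.e. one bit sent per party.
    return [(z[i], z[(i + 1) % 3]) for i in range(3)]

# Example: securely evaluate (a AND b) XOR c on shared bits.
a, b, c = share(1), share(1), share(0)
result = reveal(xor_gate(and_gate(a, b), c))
```

Because every two-input AND costs one communication round, a chain of such gates has round complexity equal to its depth, which is the bottleneck that multi-input AND gates aim to reduce.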

Abstract:
Unsupervised visible-infrared person re-identification (US-VI-ReID) seeks to learn a cross-modality retrieval model without relying on manual annotations, thereby reducing the high cost associated with labeling. Recent large-scale vision-language pre-training models, such as CLIP, have shown significant potential in enhancing pure-vision-based person re-identification. However, existing CLIP-based US-VI-ReID methods focus on independently learning semantic information within the visible and infrared modalities. These methods overlook the mismatch between the pre-training data of CLIP and the downstream cross-modality data, resulting in substantial cross-modal semantic differences. Such inconsistent semantic information, which exhibits modality discrepancies, cannot ensure the accuracy of cross-modality associations and thus hampers the performance of cross-modality learning. To address these challenges and further explore the generalizable semantic representation across modalities in CLIP, we propose a novel framework named Mining Cross-Modality Implicit Semantic Association (MCSA), which focuses on learning a modality-invariant implicit semantic space to enhance cross-modality associations and feature learning. The proposed method comprises two key modules: Modality-invariant Prompt Learning and GCNs-Driven Collaboration Alignment. Specifically, to enable CLIP to learn modality-invariant semantics, we integrate a random color augmentation branch into the visible stream for joint contrastive learning for mining generalizable semantic representations. This ensures the color generalization of the constructed implicit semantic prompts. Moreover, within the cross-modal invariant implicit semantic space, we utilize Graph Convolutional Networks (GCNs) to uncover more reliable cross-modal associations. 
By integrating information from images and semantic graphs, we jointly refine cross-modal correspondences, enabling the model to perform precise cross-modal feature learning. Extensive experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed MCSA. The code is available at https://github.com/liulekai123/MCSA

Abstract:
Text-to-image (T2I) diffusion models enable high-quality image generation conditioned on textual prompts. However, fine-tuning these pre-trained models for personalization raises concerns about unauthorized dataset usage. To address this issue, dataset ownership verification (DOV) has recently been proposed, which embeds watermarks into fine-tuning datasets via backdoor techniques. These watermarks remain dormant on benign samples but produce owner-specified outputs when triggered. Despite its promise, the robustness of DOV against copyright evasion attacks (CEA) remains unexplored. In this paper, we investigate how adversaries can circumvent these mechanisms, enabling models trained on watermarked datasets to bypass ownership verification. We begin by analyzing the limitations of potential attacks achieved by backdoor removal, including TPD and T2IShield. In practice, TPD suffers from inconsistent effectiveness due to randomness, while T2IShield fails when watermarks are embedded as local image patches. To this end, we introduce CEAT2I, the first CEA specifically targeting DOV in T2I diffusion models. CEAT2I consists of three stages: 1) motivated by the observation that T2I models converge faster on watermarked samples with respect to intermediate features rather than training loss, we reliably detect watermarked samples; 2) we iteratively ablate tokens from the prompts of detected samples and monitor feature shifts to identify trigger tokens; and 3) we apply a closed-form concept erasure method to remove the injected watermarks. Extensive experiments demonstrate that CEAT2I effectively evades state-of-the-art DOV mechanisms while preserving model performance. The code is available at https://github.com/csyufei/CEAT2I

Abstract:
With the rapid development of intelligent surveillance technology, the massive amount of multimodal data (e.g., videos, images, and text) has imposed higher demands on efficient information retrieval and security. Traditional single-modal retrieval methods struggle to meet practical requirements, making multimodal image-text retrieval a research hotspot in this field. Existing approaches, however, still face challenges in fine-grained semantic alignment and suffer from rigid matching mechanisms. To address these issues, this paper introduces SeaNcr, a novel framework that integrates cross-modal semantic entity alignment with non-correspondence reasoning. Our method constructs class-level entity representations enhanced by saliency-guided masking to capture discriminative semantic features. A pseudo-frozen asynchronous optimization strategy is introduced to maintain semantic consistency across modalities by associating stable entity representations with dynamically updated encoder features. Moreover, to overcome rigid matching, we design a non-correspondence reasoning module that jointly leverages intra-modal similarity and cross-modal mutual nearest neighbor constraints, optimizing matching flexibility and generalization. Extensive experiments validate that SeaNcr significantly enhances cross-modal feature representation and retrieval robustness, achieving state-of-the-art performance on multiple person re-identification benchmarks.

Abstract:
Multimodal language models (LMs) have shown significant potential for applications across various domains but remain vulnerable to adversarial attacks. Current research in white-box or black-box settings generally struggles with unrealistic attack assumptions and limited efficacy of targeted attacks. This paper introduces CoGA, a novel gray-box collaborative adversarial attack method for multimodal LMs. Under our gray-box settings, attackers have access only to the victim model’s input encoders. With the guidance of different modalities, we perturb the embedding representations from encoders to disrupt the semantic alignment across modalities, ultimately causing inaccurate outputs on various downstream tasks. Specifically, we integrate text embeddings into the loss calculations of the image attack and utilize image embeddings to guide the ranking of vulnerable words and the selection of final samples. Extensive experiments demonstrate that our method achieves superior attack performance across diverse models and tasks, suggesting the shared vulnerability of multimodal LMs in confronting adversarial challenges. Our work provides new insights into the security of multimodal LMs, facilitating the deployment of more robust and secure models in practical applications.

Abstract:
Cloud storage has become the most attractive way to achieve data sharing by setting flexible access policies. Cryptographic tools are considered the most popular approach to protecting the privacy of data stored on the cloud. Dividing data into different classes plays a significant role in cloud storage, making data organization more methodical and data sharing more expressive and efficient. Unfortunately, current data sharing solutions either neglect data classification or suffer from data leakage. Specifically, shared keys can decrypt newly added encrypted data within the same class, and they are open to key abuse because shared keys are untraceable once sold. In this work, we propose a time-bound data sharing system that addresses all these issues simultaneously. In our scheme, data is divided into different classes and encrypted according to its class and associated time period. Decryption keys for a set of chosen data classes can be aggregated into a single key, allowing users to decrypt multiple ciphertexts whose classes are within the set, while other encrypted data with classes outside the set remain confidential. Moreover, the aggregate key is time-bound: it can only decrypt the ciphertexts generated before the embedded time period, ensuring it cannot access newly added encrypted data. The key size is independent of the chosen class set size and is only logarithmic in the bit length of the time period used. For each sharing, the shared aggregate key is different. In the event of data leakage or key selling, the data provider can identify the responsible users. We provide formal security analysis of our system and evaluate its performance through experiments. The results demonstrate that our system is highly efficient in terms of shared keys. It provides a practical solution for achieving efficient and dynamic data sharing in cloud storage.

Abstract:
Decentralized storage auditing approaches are designed to ensure data security in dishonest decentralized storage providers. However, the need for data updates introduces new challenges to the design of decentralized storage auditing approaches. Existing approaches can support dynamic auditing for updated files. Unfortunately, they can only deal with block-level updating, which is counter-intuitive and requires conversion from semantic changes to binary changes. Furthermore, existing dynamic auditing approaches require the recalculation of auxiliary auditing information (e.g., auditing authenticators) by data owners, which imposes unnecessary additional burdens on data owners, particularly those with constrained resources in decentralized storage environments. In this paper, we focus on image files and propose iAudit, an efficient pixel-level dynamic image auditing approach in decentralized storage. We first design a novel image authenticator with image pixels for efficient dynamic auditing, which combines convolution operations and polynomial commitment in authenticator construction. Additionally, we build an owner-free dynamic mechanism for decentralized storage auditing by utilizing zero-knowledge proof techniques. In this way, the dynamic operation overheads incurred by auditing can be completely eliminated from the data owners. A prototype of iAudit is implemented, and extensive experimental results demonstrate that iAudit outperforms state-of-the-art works, achieving over a 210× speedup for the data owner in the dynamic update phase.

Abstract:
Blockchain plays a crucial role in ensuring the security and integrity of decentralized systems, with the proof-of-work (PoW) mechanism being fundamental for achieving distributed consensus. As PoW blockchains see broader adoption, an increasingly diverse set of miners with varying computing capabilities participate in the network. In this paper, we consider PoW blockchain mining in which the miners are subject to resource uncertainties. To characterize the uncertain computing resources at different mining participants, we establish an ambiguous set representing the uncertainty of resource distributions. Then, the networked mining is formulated as a non-cooperative game, where distributionally robust performance is calculated for each individual miner to tackle the resource uncertainties. We prove the existence of the equilibrium of the distributionally robust mining game. To derive the equilibrium, we propose a conditional value-at-risk (CVaR)-based reinterpretation of the best response of each miner. We then solve the individual strategy with alternating optimization, which facilitates the iteration among miners towards the game equilibrium. Furthermore, we consider the case in which the ambiguity of the resource distribution reduces to a Gaussian distribution and the case in which other uncertainties vanish, and then characterize the properties of the equilibrium therein along with a distributed algorithm to achieve the equilibrium. Simulation results show that the proposed approaches converge effectively to the equilibrium and tackle the uncertainties in blockchain mining to achieve a robust performance guarantee.

Abstract:
Wireless Channel-based Secret Key Generation (WC-SKG) offers a promising alternative for wireless communication security, yet suffers from an extremely low key generation rate (KGR) in quasi-static environments (e.g., indoor Internet of Things), where key refresh cycles can extend for hours, creating a security vulnerability. Reconfigurable Intelligent Surfaces (RIS) can boost KGR by introducing artificial randomness, yet existing schemes only randomize the RIS phase shifts without using environmental priors, failing to maximize the KGR and meet the communication signal-to-noise ratio (SNR) requirement, causing service outages. To overcome these limitations, we propose the RIS-assisted Integrated Communication and WC-SKG (RICK) scheme. First, we treat the RIS-adjusted channel as a designable random variable and derive its optimal probability density function (PDF) using a novel geometric-algebraic framework to maximize the KGR under the communication SNR constraint. Second, we design a constrained-clustering-based quantization region division scheme tailored to this optimal non-uniform PDF, guaranteeing uniformly distributed secret keys. Simulation results show RICK achieves a KGR approximately 3.5-5 times higher than the state-of-the-art scheme while saving at least 20 dB of transmit power for the same KGR, confirming its effectiveness in quasi-static scenarios.

Abstract:
Blockchain-enabled Internet of Medical Things (BIoMT) systems require secure and anonymous authentication. However, existing mechanisms rely on classical cryptography, which becomes vulnerable to quantum attacks. This creates a critical need for post-quantum secure authentication that can preserve anonymity while remaining lightweight for large-scale deployments. To address this gap, we propose a module-lattice based Post-Quantum Aggregate Blind Signature (PQ-ABS) scheme that combines message blindness, signature aggregation, and Module-LWE hardness to achieve anonymous and quantum-resistant authentication. The scheme integrates with a lightweight blockchain architecture in which multiple signatures from distributed medical entities are aggregated into a single compact proof, significantly reducing verification overhead as the number of nodes increases. Formal analysis demonstrates correctness, unforgeability, blindness, and unlinkability, as well as resilience against quantum polynomial-time (QPT) adversaries under the Module-SIS and Module-LWE assumptions. A full implementation on Hyperledger Fabric shows that, under growing network size, the proposed PQ-ABS framework reduces verification latency by up to 71%, improves throughput by 62%, and maintains stable performance as the blockchain scales, confirming both its security and efficiency for real-time BIoMT environments.

Abstract:
Malicious insiders who possess system access and security expertise are notoriously difficult to detect and can inflict severe financial damage. While recent advances in deep learning have demonstrated impressive accuracy in detecting insider threats, these models often assume the presence of well-defined or previously known anomalies. In practical organizational environments, however, threats may manifest as novel, subtle, or context-dependent behaviors that are not captured by existing patterns. Detecting such anomalies necessitates the extraction and analysis of rich behavioral features from large-scale insider activity data—an approach that, while effective, often leads to increased model complexity and computational burden. This, in turn, impedes real-time responsiveness and operational viability, potentially resulting in delayed threat mitigation and financial losses. Therefore, there is a pressing need for lightweight yet robust insider threat detection frameworks that can ensure timely and efficient deployment without compromising detection performance. To address this challenge, this paper proposes the Insider-Specific Feature Learning Autoencoder (ISFL-AE), a model designed to achieve high detection accuracy and fast processing speed. Unlike traditional reconstruction-based anomaly detection models—which use a single set of model parameters to reconstruct normal behavior for all insiders regardless of their role, authority level, or other attributes—ISFL-AE tailors its feature learning to insider-specific characteristics. ISFL-AE operates with the same number of parameters as a conventional autoencoder (AE), maintaining comparable processing speed while significantly improving detection performance. We evaluated ISFL-AE using the CERT r4.2 and r6.2 datasets. The results show that, while processing data at the same speed as a standard AE, ISFL-AE delivered markedly higher detection accuracy. 
It also outperformed other machine learning models in both detection accuracy and processing speed. Moreover, our empirical results demonstrate that integrating insider-specific feature learning into autoencoder-based deep learning architectures significantly enhances anomaly detection performance while preserving real-time processing efficiency.

Abstract:
Diffusion models have made tremendous progress in generating visually realistic images. However, these images are statistically different from real images and can be accurately classified by carefully designed detectors. To evade detection, researchers have proposed various adversarial example generation schemes for AI-generated images. Despite the progress, most of these schemes tend to post-process the images, so distortion is inevitable. In this paper, we propose an Adversarial Diffusion Model (ADM), which is able to directly generate high-quality and undetectable images from scratch on top of a pre-trained stable diffusion model. In the ADM, an adversarial denoising U-Net is proposed for searching for an adversarial latent. This latent helps generate a prompt-consistent adversarial example that is able to deceive the detector. Then, we propose a latent compensation module to make the adversarial examples have a similar reconstruction error to that of the real images. We further propose an adversarial decoder to minimize the difference between the high-frequency components of the real and adversarial examples. Comprehensive experiments are carried out to demonstrate the advantages of our ADM in generating adversarial AI-generated images.

Abstract:
The emergence of automated tools (e.g., polymorphic and metamorphic engines, packers, and genetic programming) has triggered an explosive proliferation of malware and its variants, posing a significant cybersecurity threat. To effectively address the urgent challenge of precise malware detection and classification, enabling security researchers and incident responders to promptly implement defensive measures against malware intrusions and mitigate associated damages, researchers have developed techniques such as API call analysis, permission analysis, opcode analysis, and visual image texture analysis. However, these approaches often exhibit platform-specific limitations, require specialized knowledge of PE/APK file formats, assembly languages, or low-level operating system mechanisms, and lack cross-platform compatibility. This paper introduces PyraMal, a novel malware visualization framework featuring a pyramid-structured feature distribution. PyraMal models binary bytes as a Markov chain, computes a state transition frequency matrix, and applies the Dual-threshold Truncation - Log Hierarchical Transformation Algorithm (DT-LHTA) to construct a pyramid-structured feature map for malware detection and classification. Extensive experiments conducted on six cross-platform malware benchmark datasets—CICAndMal2017Det, CICMalDroid2020Det, Malimg, BIG-2015, CICMalDroid2020Cls, and MOTIF—demonstrate that the proposed method outperforms state-of-the-art approaches while exhibiting robustness against malware variants employing evasion techniques (e.g., packers, encryption obfuscation). Notably, the method demonstrates superior aging resistance for Android malware detection. Ablation studies further confirm that the stacking of pyramid-structured feature maps (via DT-LHTA) significantly enhances malware detection and classification performance.
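
The first step described above, treating a binary's bytes as a Markov chain and counting state transitions, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the row normalization is one common choice rather than something the abstract specifies, and DT-LHTA is not sketched because the abstract does not define it.

```python
def transition_counts(data: bytes):
    """Count how often byte value a is immediately followed by byte value b."""
    counts = [[0] * 256 for _ in range(256)]
    for a, b in zip(data, data[1:]):
        counts[a][b] += 1
    return counts

def row_frequencies(counts):
    """Turn counts into per-row relative frequencies (rows with no data stay zero)."""
    freqs = []
    for row in counts:
        total = sum(row)
        freqs.append([c / total if total else 0.0 for c in row])
    return freqs

# Example on a tiny byte string: 'A' -> 'B' occurs twice, 'B' -> 'A' once.
counts = transition_counts(b"ABAB")
freqs = row_frequencies(counts)
```

Because the matrix is always 256 × 256 regardless of file format, this kind of representation needs no knowledge of PE or APK structure, which matches the cross-platform motivation stated in the abstract.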

Abstract:
Federated Learning (FL) enables collaborative model training across distributed participants without sharing raw data, offering a privacy-preserving paradigm. However, recent studies on gradient inversion attacks have demonstrated the vulnerability of FL to adversaries who can reconstruct sensitive local training data from shared gradients. To mitigate this threat, we propose Gradient Dropout, a novel defense mechanism that disrupts reconstruction attempts while preserving model utility. Specifically, Gradient Dropout perturbs gradients by randomly scaling a subset of components and replacing the remainder with Gaussian noise, thereby creating a transformed gradient space that significantly impedes reconstruction attempts. Moreover, this mechanism is applied across all layers of the model, ensuring that attackers cannot exploit any unperturbed gradients. Theoretical analysis reveals that the perturbed gradients can be kept sufficiently distant from their true values, thereby providing safety guarantees for the proposed algorithm. Furthermore, we demonstrate that this protection mechanism minimally impacts model performance, as gradient dropout and the original training dynamics remain effectively bounded under certain convexity conditions. These findings are substantiated through experimental evaluations, where we show that various attack methods yield low-quality reconstructed images while model performance is largely preserved, with less than 2% accuracy reduction relative to the baseline. As such, Gradient Dropout is presented as an effective solution for safeguarding privacy in FL, providing a balanced trade-off between privacy protection, computational efficiency, and model accuracy.
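
The perturbation described above (randomly scale a subset of gradient components, replace the rest with Gaussian noise) can be sketched as below. This is an illustrative sketch, not the authors' implementation: the parameters `keep_prob`, `scale_range`, and `noise_std` are hypothetical, since the abstract does not give the selection probability, the scaling distribution, or the noise variance.

```python
import random

def gradient_dropout(grad, keep_prob=0.5, scale_range=(0.5, 1.5),
                     noise_std=1.0, seed=None):
    """Perturb a flat gradient: scale a random subset, replace the rest with noise.

    All default values are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    perturbed, kept = [], []
    for g in grad:
        if rng.random() < keep_prob:
            # Retained component: multiplied by a random scale factor.
            perturbed.append(g * rng.uniform(*scale_range))
            kept.append(True)
        else:
            # Dropped component: replaced by zero-mean Gaussian noise.
            perturbed.append(rng.gauss(0.0, noise_std))
            kept.append(False)
    return perturbed, kept

# Example: perturb a small gradient vector reproducibly.
noisy, mask = gradient_dropout([0.2, -1.3, 0.7, 0.05], seed=42)
```

Applying the same function to the gradients of every layer would correspond to the all-layer protection mentioned in the abstract.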

Abstract:
With the widespread use of smartphones, malware has posed serious threats to their security, making its detection of utmost concern. To combat the evolving malware attacks, deep learning-based methods have been successfully developed in practical applications due to their strong generalization and unparalleled flexibility in automatic malware detection. However, recent studies have shown that the highly complex transformations of machine learning models, the general unverifiability caused by compound structures, and the unexplainability of predictions have enabled the attackers to carry out inference of the models, which has led to the creation of adversarial samples. Therefore, recent research has concentrated on the key areas of defense against adversarial attacks such as malicious detection. This paper introduces NetAED, a framework for reactive defenses against malware attacks based on adversarial examples, which neither modifies the deployed classifier nor requires knowledge of the process for crafting adversarial examples. In NetAED, we propose a Random Cross-Region Feature Perturbation mechanism and employ non-linear quantization to alleviate the impact of adversarial examples. We further develop ARNDroid, a malware detection system against adversarial examples, which integrates the proposed NetAED. Promising experimental results based on real-world datasets demonstrate that ARNDroid typically provides superior classification performance and robustness to white-box attacks compared with state-of-the-art approaches.

Abstract:
TinyML models deployed on edge devices are increasingly adopted in safety/security-critical applications, making them a prime target for adversarial example (AE) attacks where inputs are modified to cause misclassifications. However, existing AE detection methods either require white-box model access, which is often unavailable in licensed black-box deployments, or rely on input pre-processing stages that add non-trivial latency and resource overhead, often exceeding what mission-critical applications can afford on their inference path. To address these challenges, we propose AdvScan, a runtime power analysis-based methodology for AE detection that operates in a black-box scenario while inducing minimal latency. AdvScan is based on the observation that AEs produce anomalous neuron activations, which, in turn, generate distinctive power-consumption signatures. The algorithm initially constructs a baseline distribution of power signatures from known benign inputs, then, at runtime, applies a one-sample t-test to determine whether a test input’s power signature significantly deviates from this baseline, thereby detecting AEs. We evaluated AdvScan using three AE generation algorithms (Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and Carlini–Wagner (C&W)) on three MLPerf Tiny benchmark models implemented on two target devices: the STM32F303RC (ARM Cortex-M4) and STM32L562RE (ARM Cortex-M33) microcontrollers. Across 318,400 total test inputs, AdvScan detects 99.984% of AEs with only 40 false negatives and zero false positives. These results demonstrate the viability of power-based AE detection for secure, accuracy-critical TinyML deployments in black-box environments.
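
The detection step described above can be illustrated with a plain one-sample t-test. The sketch below makes several assumptions not stated in the abstract: a power signature is modeled as a list of scalar power samples, the baseline is summarized by the mean of benign measurements, and a fixed hypothetical threshold `t_crit` stands in for a critical value chosen from the t-distribution at the desired significance level.

```python
import math
import statistics

def t_statistic(samples, baseline_mean):
    """One-sample t statistic of `samples` against a hypothesized mean."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation, needs n >= 2
    if sd == 0:
        return 0.0 if mean == baseline_mean else math.copysign(math.inf, mean - baseline_mean)
    return (mean - baseline_mean) / (sd / math.sqrt(n))

def is_adversarial(samples, baseline_mean, t_crit=3.0):
    """Flag the input when its signature deviates significantly from the baseline."""
    return abs(t_statistic(samples, baseline_mean)) > t_crit

# Hypothetical power readings (arbitrary units) for illustration only.
baseline_mean = statistics.fmean([10.0, 10.2, 9.8, 10.1, 9.9])
benign_trace = [10.1, 9.9, 10.0, 10.2, 9.8]
suspicious_trace = [11.0, 11.2, 10.9, 11.1, 11.0]
flags = (is_adversarial(benign_trace, baseline_mean),
         is_adversarial(suspicious_trace, baseline_mean))
```

Since the test only consumes measurements taken alongside inference, it needs no access to model weights, which is consistent with the black-box setting the abstract targets.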

Abstract:
Model extraction (ME) attacks pose a growing security risk to machine learning systems, allowing adversaries to replicate the functionality of a cloud model through query-response pairs and undermining the intellectual property of companies. Previous ME attacks have predominantly targeted cloud-based models, leaving the threats to vision-based systems in the physical world unexplored, e.g., extracting the model functionality from an autonomous vehicle. In this paper, we present PROTheft, an ME attack extended to the physical domain. Specifically, PROTheft targets real-world vision-based devices that provide users with black-box access only. Leveraging a projector, PROTheft can be effortlessly executed by any user of the target device. The projector is positioned in front of the on-board camera to replicate real-world scenarios, effectively mapping digital-domain attack samples into the physical domain. To adapt digital ME attacks to the physical domain, we address the problem of misleading annotations caused by the detail loss during digital-to-physical-to-digital (DPD) transformation, wherein digital inputs are projected and then captured as on-board camera inputs. Specifically, we develop a simulation module that evaluates the effectiveness of digital attack samples in the physical domain, enabling a more accurate sample assessment and ultimately improving the overall attack performance. We evaluate PROTheft on a public autonomous driving dataset, achieving over 80% fidelity with the target model and an mAP50 above 0.85.

Abstract:
Transfer-based attacks craft adversarial examples on white-box surrogate models and directly deploy them against black-box targets, posing practical query-free threat scenarios. While flatness-enhanced methods have recently emerged to improve transferability by smoothing the loss surface of adversarial examples, their divergent flatness definitions and heuristic attack designs suffer from unexamined optimization limitations and a missing theoretical foundation, thereby constraining their effectiveness and efficiency. This work exposes the severely imbalanced exploitation-exploration dynamics in flatness optimization, establishing the first theoretical foundation for flatness-based transferability and proposing a principled framework to overcome these optimization pitfalls. Specifically, we systematically unify fragmented flatness definitions across existing methods, revealing their optimization limitations: either over-exploration of sensitivity peaks or over-exploitation of local plateaus. To resolve these issues, we rigorously formalize average-case flatness and transferability gaps, proving that enhancing zeroth-order average-case flatness minimizes cross-model discrepancies. Building on this theory, we design a Maximin Expected Flatness (MEF) attack that enhances zeroth-order average-case flatness while balancing flatness exploration and exploitation. Extensive evaluations across 33 models and 43 current transfer-based attacks demonstrate MEF’s superiority: it surpasses the state-of-the-art PGN attack by 4% in attack success rate at half the computational cost and achieves an 8% higher success rate under the same budget. When combined with input augmentation, MEF attains 15% additional gains against defense-equipped models, establishing new robustness benchmarks. Our code is available at https://github.com/SignedQiu/MEFAttack

Affiliations: School of Computer Science and the School of Software, Nanjing University of Information Science and Technology, Nanjing, China; School of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, China; Information Engineering University, Zhengzhou, China; Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China; College of Cryptology and Cyber Science, Nankai University, Tianjin, China

Abstract:
The quantization step is a crucial parameter in the JPEG compression process, and provides prior knowledge for JPEG image steganography and forensics. Existing neural network-based methods typically estimate the quantization steps for all discrete cosine transform (DCT) subbands jointly, by treating the entire quantization table as a unified input and leveraging the inter-subband relationships. However, subband relationships vary across different quantization tables, leading to poor generalization for methods that rely heavily on such relationships. To address the above issues, we depart from the strategy that relies on inter-subband relationships and instead train the model on a specific single subband. To compensate for the possible decrease in accuracy due to the lack of relationships between subbands, we extract the ranking features and histogram features from the DCT coefficient histograms of the subbands. Ranking features capture local patterns in DCT histograms by modeling the relative relationships between neighboring coefficients, thereby compensating for the absence of local detail. On the other hand, histogram features represent the overall distribution pattern of the DCT coefficient histograms and capture the global trends and statistical properties in the subbands. We subsequently employ convolutional groups and multilayer perceptron (MLP) structures to extract compression artifacts from these two features. Finally, we introduce a comprehensive evaluation metric, called GenAQt, to quantify the algorithm’s generalization ability across quantization tables. The experimental results demonstrate that our method maintains high accuracy across quantization tables, with RelGenAQt (relative accuracy decrease) exceeding 81% and AbsGenAQt (absolute accuracy decrease) being less than 0.38.
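As a rough illustration of why the DCT coefficient histogram of a single subband carries information about its quantization step (this is not the proposed network): dequantized coefficients in a subband are integer multiples of the step, so the histogram has mass only at those multiples. The sketch below assumes noiseless integer coefficients, and the helper name `estimate_step` is hypothetical.

```python
from functools import reduce
from math import gcd


def estimate_step(coeffs):
    """Estimate one subband's quantization step as the GCD of its nonzero
    dequantized DCT coefficients (assumed to be exact integers)."""
    nonzero = [abs(c) for c in coeffs if c != 0]
    if not nonzero:
        return None  # an all-zero subband reveals nothing about the step
    # The GCD is the true step or a multiple of it, depending on which
    # multiples happen to occur in this subband.
    return reduce(gcd, nonzero)


print(estimate_step([0, 6, -12, 18, 0, 6]))
```

In decompressed images the coefficients are only approximately multiples of the step because of spatial-domain rounding and truncation, which is one reason learned histogram features are useful in place of an exact rule like this.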

Affiliations: College of Computer and Cyber Security, Fujian Normal University, Fuzhou, China; College of Computer and Cyber Security, the School of Mathematics and Statistics, the Key Laboratory of Analytical Mathematics and Applications (Ministry of Education), and Fujian Provincial Key Laboratory of Network Security and Cryptology, Fujian Normal University, Fuzhou, Fujian, China; School of Computer and Electronic Information, Nanjing Normal University, Nanjing, China; School of Mathematics and Statistics, Fuzhou University, Fuzhou, China; School of Information Science and Engineering, Zhejiang Sci-Tech University, Hangzhou, China

Abstract:
As cloud computing advances, data owners increasingly upload large volumes of data to the cloud. Attribute-based searchable encryption (ABSE) empowers data owners to manage fine-grained access over encrypted cloud files, and supports keyword-based search for authorized users. However, current multi-owner searchable encryption schemes often suffer from efficiency limitations and vulnerabilities to keyword guessing attacks. Furthermore, access policies are typically stored in plain form, exposing confidential details about data owners and authorized users. To tackle these issues, we put forward an expressive attribute-based searchable encryption scheme with full policy concealment. Our design leverages the reduced ordered binary decision diagram (ROBDD) for access control targeting multi-user and multi-owner environments. In our scheme, users can flexibly select data owners and utilize a single trapdoor to search across shared datasets. The integration of a warrant server that signs obfuscated keywords prevents the cloud server from launching effective keyword guessing attacks. The adoption of ROBDD enables complex access policies via Boolean operations, thereby significantly enhancing the efficiency and flexibility of access control. Full policy hiding is achieved by mapping ROBDD paths to an improved Bloom filter, preventing access policy leakage. We present formal definitions and security models of the proposed approach, along with rigorous security proofs. Performance evaluation is conducted through theoretical analysis and simulations. Experimental results indicate that our scheme achieves superior efficiency over state-of-the-art alternatives, offering a robust solution for secure and flexible cloud data management.
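To make the idea of mapping policy paths into a Bloom filter concrete, here is a minimal sketch of a plain (not the "improved") Bloom filter. The path encoding as a string such as `"x1=1,x3=0"`, the class name, and the parameters are all assumptions for illustration, not the scheme's construction.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: membership tests may give false positives,
    but never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive k positions by prefixing the item with a hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("x1=1,x3=0")  # hypothetical encoding of one satisfying ROBDD path
print(bf.might_contain("x1=1,x3=0"))
```

Only hashed positions are stored, so the filter itself does not list the paths it contains; how the actual scheme strengthens this is not described in the abstract.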

Abstract:
As cyberspace continues to expand, identifying the organization or individual behind a website has become increasingly vital in security incident response, phishing website detection, and other cybersecurity subfields. One existing solution involves analyzing webpage content and extracting owner names using named entity recognition techniques. However, since these techniques operate on a sentence-by-sentence basis, they struggle to identify the true owner when multiple individual or organizational names appear on a webpage. Moreover, they often perform poorly on non-English websites. To address these limitations, we propose OwnerHunter, a novel multilingual framework powered by large language models, which formulates website owner identification as a multilingual document-level information extraction task and utilizes global information from webpages to identify the owner. In OwnerHunter, we first craft prompts that fully leverage the capabilities of large language models to effectively recognize potential owners on webpages in different languages with minimal examples. To enhance the comprehensiveness and accuracy of recognition, we further design a multimodal augmentation strategy, an example pool strategy, and a self-verification strategy. Then, we devise a semantic and string similarity aggregation-based entity disambiguation technique to eliminate ambiguities among multiple potential owners recognized by large language models and a position-based hybrid ranking technique to select the true owner. To evaluate OwnerHunter, we refine the publicly available English dataset ONER and construct the Chinese dataset WOI-cn with 16,036 real websites. Experimental results show that OwnerHunter achieves F1 scores of 0.9505 on ONER and 0.9621 on WOI-cn, setting new state-of-the-art performance on both datasets. Code is available at https://github.com/tuchen9/OwnerHunter

Abstract:
Caching web data on edge servers has become a common practice in latency-sensitive services to minimize data retrieval delays for web users. However, the geographic distribution of edge servers and frequent data transmissions make these systems vulnerable to security threats, particularly cache pollution attacks (CPAs). In such attacks, malicious users send excessive requests for unpopular data at abnormal frequencies, causing irrelevant content to be cached and degrading the system’s performance. Traditional CPAs, though impactful in conventional caching systems, are less effective in edge environments where user requests are more diverse and edge servers collaborate in caching strategies. In this paper, we identify a novel attack named data flipping attack (DFA) that targets the data transmission process among edge servers. This attack manipulates request distribution by swapping the frequencies of popular and unpopular data requests, all while maintaining other characteristics like request timing and user identity. This tactic disrupts caching strategies without raising suspicion. Experimental results indicate DFA is independent of user request patterns and demonstrates substantial effectiveness and robustness, successfully forcing edge web users to retrieve data from the cloud across various scales and configurations of edge networks. Furthermore, it evades detection by state-of-the-art methods that rely on specific distribution patterns, such as the Zipf distribution. To counter this attack, we propose an effective defense method that alters the request distribution by frequency distillation, mitigating its impact.
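The frequency swap at the heart of the attack can be illustrated with a small sketch. It assumes one possible reading of "swapping the frequencies of popular and unpopular data requests", namely a full rank reversal of per-item request counts; the function name `flip_frequencies` is hypothetical.

```python
def flip_frequencies(counts):
    """Return a new request-count map in which the i-th most popular item
    receives the count of the i-th least popular item."""
    ranked = sorted(counts, key=counts.get, reverse=True)  # most popular first
    ascending = sorted(counts.values())                    # smallest count first
    return dict(zip(ranked, ascending))


observed = {"a": 100, "b": 50, "c": 10, "d": 1}
print(flip_frequencies(observed))
```

The total number of requests is unchanged, which matches the abstract's point that other characteristics of the traffic are preserved while the popularity ranking seen by the caching strategy is inverted.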

Abstract:
User identification technologies are essential for ensuring security and privacy. Compared to conventional biometric identification methods, electroencephalogram (EEG)-based brainprint recognition provides unique advantages, including non-replicability, resistance to coercion, and inherent liveness detection. However, existing EEG-based brainprint recognition methods are typically tailored for specific tasks and evaluated under conditions that differ substantially from real-world use. To overcome these limitations, we propose BrainprintNet, a convolutional neural network architecture integrating fine-grained filter banks, grouped multiscale temporal convolutions, and cross-band spatial fusion to enhance EEG-based brainprint recognition. BrainprintNet surpasses previous architectures in challenging scenarios involving simultaneous cross-session and cross-task recognition, demonstrating its generalization ability under strict simulation for real-world applications. Comprehensive experiments were conducted using three publicly available datasets encompassing nine distinct tasks. Furthermore, visualization of the learned network weights revealed strong correlations between user identity and specific EEG frequency subbands and channels. The proposed BrainprintNet significantly advances the accuracy, flexibility, and practical applicability of EEG-based brainprint recognition systems.

Abstract:
Wireless federated analytics face two critical challenges: data privacy and communication efficiency, since the local data may contain sensitive information and the users may be equipped with limited communication capability. Existing methods often adopt a direct combination of privacy-preservation schemes and compression mechanisms but overlook the privacy amplification effect from errors introduced in compression and wireless communication. With this in mind, we propose a Differentially Private Quadrature Amplitude Modulation (DP-QAM) scheme, which leverages privacy amplification from both compression and noisy wireless channels. The privacy guarantee is established in terms of the emerging $f$-DP framework, and the trade-off between privacy, communication cost, and accuracy, measured by mean square error (MSE), is characterized in the fundamental use cases of distributed mean estimation and frequency estimation, where the proposed scheme outperforms state-of-the-art methods. Moreover, the advantage of the proposed method over the classic Gaussian mechanism is further demonstrated from a rate-distortion perspective. Finally, extensive simulation results validate the effectiveness of the proposed mechanism.
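For context, the classic Gaussian mechanism used as the comparison baseline can be sketched for mean estimation as follows. This is the textbook central-model mechanism with noise scale sqrt(2 ln(1.25/delta)) * sensitivity / epsilon (valid for 0 < epsilon < 1), not DP-QAM; the function names and the clipping range are assumptions.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism for (epsilon, delta)-DP."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("this bound requires 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def private_mean(values, lo, hi, epsilon, delta, rng=random):
    """Release the mean of values clipped to [lo, hi] with Gaussian noise."""
    clipped = [min(max(v, lo), hi) for v in values]
    n = len(clipped)
    # Replacing one user's value moves the mean by at most (hi - lo) / n.
    sensitivity = (hi - lo) / n
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return sum(clipped) / n + rng.gauss(0.0, sigma)


data = [0.2, 0.9, 0.4, 0.7] * 250
print(private_mean(data, 0.0, 1.0, epsilon=0.5, delta=1e-5))
```

The sketch adds noise at a trusted aggregator and ignores compression and channel noise entirely, which are exactly the effects the abstract says DP-QAM exploits.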

Abstract:
Pedestrian Attribute Recognition (PAR) is an indispensable task in human-centered research and has made great progress in recent years with the development of deep neural networks. However, the potential vulnerability and anti-interference ability of PAR models have still not been fully explored. To bridge this gap, this paper proposes the first adversarial attack and defense framework for pedestrian attribute recognition. Specifically, we exploit both global- and patch-level attacks on the pedestrian images, based on a pre-trained CLIP-based PAR framework. This framework first divides the input pedestrian image into non-overlapping patches and embeds them into feature embeddings using a projection layer. Meanwhile, the attribute set is expanded into sentences using prompts and embedded into attribute features using a pre-trained CLIP text encoder. A multi-modal Transformer is adopted to fuse the obtained vision and text tokens, and a feed-forward network is utilized for attribute recognition. Based on this PAR framework, we adopt adversarial semantic and label perturbation to generate the adversarial noise, termed ASL-PAR. We also design a semantic offset defense strategy to suppress the influence of adversarial attacks. Extensive experiments conducted on both digital domains (i.e., PETA, PA100K, MSP60K, RAPv2) and physical domains validate the effectiveness of our proposed adversarial attack and defense strategies for pedestrian attribute recognition. The source code of this paper will be released at https://github.com/Event-AHU/OpenPAR

Abstract:
Large Language Models (LLMs) have been widely adopted to enhance Task-Oriented Dialogue Systems (TODS) by modeling complex language patterns and delivering contextually appropriate responses. However, this integration introduces significant privacy risks, as LLMs, functioning as soft knowledge bases that compress extensive training data into rich knowledge representations, can inadvertently memorize training dialogue data containing not only identifiable information such as phone numbers but also entire dialogue-level events like complete travel schedules. Despite the critical nature of this privacy concern, how LLM memorization is inherited in developing task bots remains unexplored. In this work, we address this gap through a systematic quantitative study that involves evaluating existing training data extraction attacks, analyzing key characteristics of task-oriented dialogue modeling that render existing methods ineffective, and proposing novel attack techniques tailored for LLM-based TODS that enhance both response sampling and membership inference. Experimental results demonstrate the effectiveness of our proposed data extraction attack. Our method can extract thousands of training labels of dialogue states with best-case precision exceeding 70%. Furthermore, we provide an in-depth analysis of training data memorization in LLM-based TODS by identifying and quantifying key influencing factors and discussing targeted mitigation strategies.

Affiliations: Jiangxi Provincial Key Laboratory of Image Processing and Pattern Recognition, Nanchang Hangkong University, Nanchang, China; School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea; PAMI Research Group, Department of Computer and Information Science, Centre for Artificial Intelligence and Robotics, Institute of Collaborative Innovation, University of Macau, Macau, China; School of Cyber Science and Engineering, Sichuan University, Chengdu, China

Abstract:
Palmprint recognition systems experience a significant performance decline in cross-domain scenarios due to domain shift caused by non-identity factors such as capture devices and lighting conditions. To address this issue, this paper introduces a novel deep decoupling framework, the Identity and Style Feature Decoupling Network (ISFDNet), designed to improve the model’s cross-domain generalization. ISFDNet explicitly separates stable identity-related information from variable domain-related style information within palmprint features. The framework incorporates two innovative mechanisms: at the feature level, the Spatially-Aware Separation Module (SASM) adaptively produces complementary spatial attention masks to decouple mixed features into identity and style components; at the image level, the Low-Frequency Disturbance Module (LFDM) creates stylized training samples by perturbing the low-frequency parts of images, encouraging the network to learn identity representations that are insensitive to style variations. Additionally, a carefully designed collaborative supervision strategy combines multiple losses to ensure effective decoupling. Extensive experiments on four publicly available palmprint datasets demonstrate that ISFDNet achieves top performance in both cross-domain and in-domain tests, while significantly enhancing the generalization capabilities of existing networks. The code is released at https://github.com/20201422/ISFDNet

Abstract:
Advanced Persistent Threat (APT) presents a significant challenge to the cybersecurity of contemporary organizations. This challenge is further exacerbated when APT actors collaborate with malicious insiders. The involvement of the insider transforms a bilateral adversarial scenario into a triadic strategic interaction, introducing additional layers of complexity in modeling and defense planning. Effective defense against insider-facilitated APT necessitates a comprehensive treatment of two critical aspects: (i) the dynamic strategic interactions among the three players—the defender, the insider, and the APT actor—and (ii) the impact of these interactions on the evolving state of the intranet. However, both dimensions are insufficiently addressed in existing research. To bridge this gap, we first develop an expected state evolution model that captures the real-time influence of the dynamic strategies of the players on the expected compromise state of the intranet. Building upon this, we formulate a three-player differential game model that explicitly incorporates the dynamic interactions of all participants. The associated optimality system is derived and numerically solved using a proposed iterative algorithm. The proposed algorithm achieves a 27.5% improvement in the organization’s expected payoff compared to baseline permissible strategies. Subsequently, we analyze key properties of the proposed framework and empirically evaluate the cost-effectiveness of the resulting defense strategy. To the best of our knowledge, this work represents the first application of three-player differential game theory in the domain of cybersecurity, offering a novel approach to defending against insider-facilitated APT.

Abstract:
State-of-the-art covert routing in heterogeneous networks (HetNets) focuses on balancing covertness and throughput, but often overlooks explicit energy optimization. While covert communication inherently limits transmit power, meeting throughput demands without coordinated design can still lead to high energy consumption. To this end, we propose DECOR, a Decentralized Energy-efficient COvert Routing framework that jointly optimizes covertness, throughput, and energy efficiency. Unlike traditional methods that use a single wireless technology, DECOR leverages the diversity of available wireless communication technologies in HetNets to enable simultaneous multi-modal routing. The core idea behind DECOR is that optimal simultaneous utilization of multiple modalities improves throughput and overall energy efficiency. It minimizes the end-to-end energy consumption while satisfying stringent constraints on throughput and covertness through two core steps: (1) link-level optimization using sequential least squares programming (SLSQP), and (2) network-level optimization through a custom cluster-based routing strategy. DECOR introduces a novel clustering-based strategy that aggregates intra-cluster link information and delegates routing decisions to cluster heads, significantly reducing control overhead and enabling scalable, energy-efficient covert communication. Extensive numerical analysis demonstrates that DECOR significantly outperforms existing approaches in terms of energy efficiency and data overhead.

Abstract:
Smart contracts play a pivotal role in blockchain ecosystems, and fuzzing remains a critical approach to securing them. However, existing smart contract fuzzers often optimize either seed generation or mutation scheduling in isolation and rely on narrow, fragmented feedback signals, leaving multi-transaction reasoning and stagnation recovery under-explored. In this work, we propose a Large Language Model (LLM)-based Multi-feedback Smart Contract Fuzzing framework (LLAMA). Key components of the proposed LLAMA include: 1) a hierarchical prompting strategy that guides LLMs to generate structurally valid, context-aware multi-transaction initial seeds, together with a lightweight pre-fuzzing phase that validates and prioritizes high-potential LLM-generated candidates; 2) a multi-feedback-guided evolutionary optimization module that jointly optimizes seed selection and mutation scheduling by a group of constraints for driving an LLM-bootstrapped bandit scheduler; 3) an LLM-guided hybrid fuzzing module that integrates evolutionary fuzzing with a dual-channel recovery mechanism, which concurrently employs asynchronous coverage-stagnation-based LLM reseeding and selective symbolic execution to resolve complex path constraints. Our extensive experiments demonstrate that LLAMA outperforms state-of-the-art fuzzers in both coverage and vulnerability detection. Specifically, it achieves 92% instruction coverage on small contracts and 81% on large contracts, while detecting 132 out of 148 known vulnerabilities across diverse categories. Ablation studies further show that the proposed multi-feedback and hybrid recovery strategies have a strong impact on LLAMA’s performance. The results demonstrate LLAMA’s effectiveness, adaptability, and practicality in complex smart contract scenarios.

Abstract:
Non-transferable learning (NTL) has emerged as a promising method to protect the intellectual property of deep learning models by restricting cross-domain knowledge transfer. However, the robustness of its transferability constraints against potential attacks has not been explored, especially in practical deployment scenarios. In this paper, we propose a novel black-box attack framework, termed Distribution Drift Learner (DDL), which effectively bypasses NTL protection mechanisms by only accessing input-output queries of protected models. The theoretical foundation of DDL is derived from the concept of data drift, which takes advantage of the variability of the statistical distribution between the source and target domains. The core innovation of DDL is the integration of distribution perception regularization into a lightweight autoencoder architecture, enabling efficient manipulation of data distribution by optimizing dual objectives (distribution perception loss and reconstruction loss). Training for DDL involves two key steps. First, DDL reconstructs a moderate number of target domain samples and feeds the reconstructed images into the NTL model to obtain prediction labels. Second, the DDL parameters are updated by optimizing the distribution perception loss and the reconstruction loss. Through extensive experiments against standard NTL benchmarks (Digits, CIFAR10, and STL10), we demonstrate that DDL has successfully overcome the transferability barriers of the NTL model and improved the accuracy of the target domain by 81% from 10%. Our work reveals critical vulnerabilities in the NTL framework, particularly with respect to ownership verification and applicability authorization mechanisms, providing valuable insights for developing more robust model protection strategies in real-world applications.

Abstract:
We introduce a comprehensive approach to enhance the security, privacy, and sensing capabilities of integrated sensing and communications (ISAC) systems by leveraging random frequency agility (RFA) and random pulse repetition interval agility (RPA) techniques. The combination of these techniques, which we collectively refer to as random frequency and pulse repetition interval agility (RFPA), with channel reciprocity-based key generation (CRKG) obfuscates both Doppler frequency and pulse repetition intervals (PRIs), significantly hindering passive adversaries’ ability to estimate radar parameters. In addition, a hybrid information embedding method integrating amplitude shift keying (ASK), phase shift keying (PSK), index modulation (IM), and spatial modulation (SM) is incorporated to significantly increase the system’s achievable bit rate. Next, a sparse-matched filter receiver design is proposed to efficiently decode the embedded information with a low bit error rate (BER). Finally, a novel RFPA-based secret generation scheme using CRKG enables secure code creation without a coordinating authority. The improved range and velocity estimation, and the reduced clutter effects achieved by the method, are demonstrated through the evaluation of the ambiguity function (AF) of the proposed waveforms.

Abstract:
Anomaly detection plays a vital role in processing multi-source data through public cloud servers, yet existing privacy-preserving schemes fail to efficiently detect anomalies while protecting data source privacy. Although isolation forests offer advantages for unsupervised high-dimensional data analysis, implementing their tree-based mechanisms in a privacy-preserving manner remains challenging. In this paper, we propose IFAD, a novel isolation forest-based scheme for detecting anomalies in private data. IFAD guarantees end-to-end privacy protection by safeguarding original data, tree structures, and intermediate information throughout detection workflows. Our design achieves efficiency through three key contributions: 1) Cryptographic building blocks combining function secret sharing (FSS) and secret sharing (SS) to enable secure computations; 2) A split index protocol and layer update protocol to facilitate efficient, layer-by-layer isolation forest construction; 3) A detection phase optimization converting the anomaly score calculations into lookup table operations. Experimental evaluations demonstrate that IFAD achieves superior performance, outperforming prior schemes by 2.4× to 3.1× in runtime under LAN and WAN environments, and by 1.8× to 7.8× in online communication overhead, while maintaining comparable detection accuracy. Our solution establishes an effective balance between privacy preservation and operational efficiency for cloud-based anomaly detection.
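The idea of turning the anomaly score into a table lookup can be sketched in plaintext, without any of the secret-sharing machinery. The standard isolation forest score is 2^(-E[h(x)] / c(n)), where c(n) is the average path length for a subsample of size n. The sketch assumes integer path lengths bounded by a maximum depth so that the table can be indexed by the summed path length; the function names are hypothetical and this is not IFAD's protocol.

```python
import math

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """c(n): normalizing constant from the standard isolation forest score."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_score_table(num_trees, max_depth, sample_size):
    """Precompute the score for every possible summed path length."""
    norm = average_path_length(sample_size)
    return [2.0 ** (-(total / num_trees) / norm)
            for total in range(num_trees * max_depth + 1)]


def anomaly_score(path_lengths, table):
    """Score a point with one addition chain and one table lookup."""
    return table[sum(path_lengths)]


table = build_score_table(num_trees=4, max_depth=8, sample_size=256)
print(anomaly_score([1, 1, 2, 1], table), anomaly_score([8, 7, 8, 8], table))
```

Short paths give scores close to 1 (likely anomalies) and long paths give lower scores. Moving the exponentiation and division offline leaves only additions and an index at detection time, which are the kinds of operations that are cheap to evaluate under secret sharing.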

Abstract:
With the rapid development of generative models, generated images have become almost visually indistinguishable from real images, which poses a huge challenge to content authenticity verification. A key limitation of existing detectors is their reliance on model-specific cues, resulting in poor generalization to unseen models. By examining local differences in generated images, we found that they lack device-specific sensor noise and exhibit unnatural pixel intensity variations caused by the oversimplified generation process. These discrepancies provide important forensic cues for distinguishing between real and generated images. We propose the Feature Aggregation for Localized Context and Noise Network (FALCON-Net), which leverages these discrepancies to enhance detection capabilities. FALCON-Net integrates two complementary modules: the Intrinsic Noise Pattern Isolation (INP) module isolates device-specific noise patterns by analyzing high-frequency features in the frequency domain, while the Local Variation Pattern (LVP) module models the complex relationships between local pixels to capture directional intensity variations and reveal unnatural regularities in generated images. By combining these sensor-level and local structural cues, FALCON-Net identifies fundamental generative inconsistencies, ensuring robustness to post-processing and strong generalization to unseen models. Extensive experimental results show that FALCON-Net achieves state-of-the-art performance in detecting generated images and generalizes well to unseen generative models. The code is available at https://github.com/humiaomiaohaha/FALCON-Net

Abstract:
Person re-identification (Re-ID) across RGB and depth modalities offers complementary cues for robust pedestrian matching under challenging conditions. However, the significant discrepancy between RGB appearance features and depth structural features complicates cross-modal alignment. Existing methods either depend on static architectural designs or impose strong constraints to capture the common features of the two modalities, often suffering from branch imbalance or distorted identity features. In this work, we propose a novel framework, Dynamic Correlation-Guided Disentanglement and Contrastive Learning (DCG-DCL), for RGB-D cross-modal Re-ID. First, the Dynamic Correlation-guided Disentanglement (DCGD) dynamically decouples features with the guidance of inter-modal correlation, which explicitly enforces common-feature learning via a cross-correlation constraint and adaptively separates common and unique components without predefined assumptions. Second, a Common & Unique Contrastive Learning (CUCL) strategy fully leverages these decoupled features, which aligns RGB/depth features closer to their common representation and pushes them away from unique redundancies. This dual mechanism effectively narrows modality discrepancy and boosts robustness against modality-specific noise. Extensive experiments on multiple public benchmarks demonstrate that our method achieves state-of-the-art performance, with ablation studies validating the necessity of each component.

Abstract:
To address growing concerns about data privacy on mobile devices, the federated learning (FL) paradigm enables clients to collaboratively train models while sharing only local model updates. However, privacy risks remain in FL, as adversaries can still infer sensitive information from these updates. To enhance secure aggregation in FL, various protection mechanisms combining encryption and multi-party computation (MPC) have been proposed. These approaches, however, often introduce substantial communication and computational overhead, making secure aggregation impractical on resource-constrained devices, e.g., smartphones. To tackle these efficiency challenges, we are among the first to propose the integration of differential privacy (DP) with encryption and MPC for secure aggregation. Our proposed protocol, Federated Learning with Noise-based Secure Aggregation (FedNSA), injects noise through DP to obfuscate individual model updates. Encryption is employed to correlate the noise across different clients, while MPC ensures perfect noise cancellation at the server side. Finally, we theoretically analyze the advantages of FedNSA and conduct extensive experiments on public datasets to demonstrate its superiority across multiple dimensions in comparison with state-of-the-art baselines.
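The notion of correlated noise that cancels at the server can be illustrated with the well-known pairwise-masking idea: each pair of clients derives the same noise stream and applies it with opposite signs. This is a generic sketch, not the FedNSA protocol; the shared `round_seed` string stands in for secrets that would in practice be established cryptographically, and all names are hypothetical.

```python
import random


def pairwise_noise(client_id, num_clients, dim, sigma, round_seed):
    """Noise vector for one client; each pair (i, j) shares a stream that
    the lower-indexed client adds and the higher-indexed one subtracts."""
    noise = [0.0] * dim
    for other in range(num_clients):
        if other == client_id:
            continue
        lo, hi = min(client_id, other), max(client_id, other)
        rng = random.Random(f"{round_seed}:{lo}:{hi}")  # same stream for both
        sign = 1.0 if client_id == lo else -1.0
        for d in range(dim):
            noise[d] += sign * rng.gauss(0.0, sigma)
    return noise


def aggregate(updates, sigma=1.0, round_seed=0):
    """Return the noisy per-client updates and their coordinate-wise sum."""
    n, dim = len(updates), len(updates[0])
    noisy = [[u + z for u, z in zip(update, pairwise_noise(i, n, dim, sigma, round_seed))]
             for i, update in enumerate(updates)]
    total = [sum(column) for column in zip(*noisy)]
    return noisy, total


noisy, total = aggregate([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(noisy[0], total)
```

Each individual update is perturbed, while the sum matches the sum of the clean updates up to floating-point error, because every pairwise term appears once with each sign.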

Affiliations: Department of Strategic and Advanced Interdisciplinary Research, Peng Cheng Laboratory, Control and Network Foundation Laboratory, Shenzhen, China; State Key Laboratory of Radio Frequency Heterogeneous Integration and Guangdong Key Laboratory of Intelligent Information Processing, College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China; School of Intelligent Science and Engineering, Qinghai Minzu University, Xining, Qinghai, China; College of Computing and Data Science, Nanyang Technological University, Jurong West, Singapore

Abstract:
Wireless sensing is recognized as a promising technology for next-generation wireless networks, utilizing signals from devices such as Wi-Fi to detect and interpret human-related information, including movement status and sleep quality. However, the broadcast nature of wireless signals poses significant privacy risks, as unauthorized users may intercept these signals, leading to potential privacy breaches. Given the sensitive personal information embedded in channel state information (CSI) data and the limitations of encryption-based protection at the transmitter, conventional privacy protection measures are inadequate. Thus, developing physical layer-based privacy protection technologies for wireless sensing is urgently needed. In this paper, we consider a wireless sensing system model with privacy leakage and characterize wireless sensing performance as a classification problem. We propose a novel multi-antenna signal processing-based privacy protection strategy. To illustrate the fundamental tradeoff between sensing and privacy protection, we model the wireless sensing process as a communication task based on non-cooperative joint source-channel coding and introduce the concept of a sensing rate region. Our main contribution is the characterization of sensing and privacy protection performance at two key points within the sensing rate region: $P_{\mathrm{OS}}$, indicating the maximum sensing rate of a legitimate receiver constrained by the minimum achievable sensing rate of an illegitimate receiver, and $P_{\mathrm{OPP}}$, indicating the minimum sensing rate of an illegitimate receiver constrained by the maximum sensing rate of a legitimate receiver. Based on our analysis, we provide strategies to establish achievable boundaries between $P_{\mathrm{OS}}$ and $P_{\mathrm{OPP}}$. Moreover, we define the secure sensing rate $R_{\mathrm{PP}}$, indicating the privacy protection performance of the system. Within this framework, we examine several illustrative examples, validated through numerical simulations.

Abstract:
The rapid growth of sensitive cross-domain data, such as electronic health records and genomic sequences in healthcare, presents significant opportunities for large-scale, multi-institutional collaborative analysis. Meeting stringent privacy regulations while utilizing data has become a critical challenge. Private set operations (PSO) play a crucial role in addressing this challenge. PSO protocols enable privacy-preserving data alignment across parties (such as inter-institutional data matching based on private set intersection (PSI)) and secure data aggregation (such as federated data aggregation through private set union (PSU)), providing fundamental support for cross-domain data collaboration. This type of technology is not only applicable to multi-center medical research but also has broad value in other scenarios requiring confidential data sharing. However, existing delegated/outsourced PSO schemes face two key limitations: 1) high client-side preprocessing overhead, requiring clients to perform expensive masking of private data before uploading; 2) performance and single-point dependency bottlenecks in client-assisted computation, where one client must act as a computational leader. To address these issues, we propose a secret-shared PSO framework for lightweight clients. In our scheme, the clients can go offline while servers perform all computations, significantly reducing the clients’ burden. Notably, our protocol can be extended to support multi-party settings, making it well-suited for collaborative research across multiple institutions. In addition, we prove the security of all constructions under the semi-honest model. Experiments show that when the set size n \geq 2^{16}, our protocol has a significant advantage and is well-suited for lightweight clients that hold large sets.
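
As a rough illustration of why secret sharing lets clients go offline, the sketch below splits a value into additive shares that servers can combine without further client input. Additive sharing, the modulus, and the function names are assumptions made for illustration only; the abstract does not state which sharing scheme or PSO protocol the framework uses.

```python
import secrets

# Hypothetical modulus for the share arithmetic (an assumption, not from the paper).
MODULUS = 2**61 - 1


def share(value, n_servers=2):
    """Split value into n_servers additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % MODULUS


def add_shared(shares_x, shares_y):
    """Each server adds its own two shares locally; no client is involved."""
    return [(x + y) % MODULUS for x, y in zip(shares_x, shares_y)]
```

After uploading the output of `share`, a client has no further work to do: any n_servers - 1 of the shares are uniformly random on their own, and the servers can still compute on them, as `add_shared` shows for addition.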

Abstract:
Fine-tuning large language models (LLMs) on downstream tasks has become a standard approach to adapt their capabilities. However, the process raises privacy concerns when using sensitive datasets, prompting increasing interest in differentially private (DP) fine-tuning methods. While existing approaches build upon the seminal work of DP-SGD, they are constrained by its inherent efficiency bottlenecks. In this paper, we investigate the potential of DP zeroth-order methods for LLM fine-tuning, which avoid the scalability bottleneck of SGD by approximating gradients with more efficient zeroth-order gradients. We propose the stagewise DP zeroth-order method (DP-ZOSO), which dynamically schedules key hyperparameters to leverage the synergy between DP random perturbation and the gradient approximation error. To further enhance scalability, we propose the DP zeroth-order stagewise pruning method (DP-ZOPO), which reduces the number of trainable parameters through a data-free pruning technique requiring no extra privacy budget. We provide theoretical analysis for both proposed methods and conduct extensive empirical analysis on both encoder-only masked and decoder-only autoregressive language models, achieving impressive results in terms of scalability and utility across diverse tasks.
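
To make the idea of a DP zeroth-order update concrete, here is a minimal sketch of one generic step: a two-point finite-difference estimate along a random direction, per-example clipping of the resulting scalar, and Gaussian noise on the clipped sum. It is not the authors' DP-ZOSO or DP-ZOPO; the function name, the hyperparameter names, and their defaults are all hypothetical, and no privacy accounting or stagewise scheduling is shown.

```python
import random


def dp_zo_step(loss_fn, params, batch, lr=0.01, mu=1e-3, clip=1.0, sigma=1.0, rng=None):
    """One illustrative DP zeroth-order step (hypothetical helper, not from the paper)."""
    rng = rng or random.Random(0)
    # Random Gaussian direction shared by every example in the batch.
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + mu * zi for p, zi in zip(params, z)]
    minus = [p - mu * zi for p, zi in zip(params, z)]
    total = 0.0
    for example in batch:
        # Two forward passes give a scalar estimate of the directional derivative.
        d = (loss_fn(plus, example) - loss_fn(minus, example)) / (2 * mu)
        # Clip the per-example scalar so one example has bounded influence.
        total += max(-clip, min(clip, d))
    # Gaussian noise is added to a single scalar rather than a full gradient vector.
    noisy = (total + rng.gauss(0.0, sigma * clip)) / len(batch)
    return [p - lr * noisy * zi for p, zi in zip(params, z)]
```

Only forward evaluations of `loss_fn` are needed, which is what removes the backpropagation cost; the price is a noisier search direction, since each step probes a single random direction.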

Abstract:
Adversarial training (AT) is among the most effective defenses against adversarial attacks on deep neural networks. However, in real-world scenarios where data often follow long-tailed distributions, conventional AT methods struggle to handle such imbalance, resulting in severe robustness disparities across classes and limited overall robustness. Although recent efforts attempt to improve robustness through class frequency-aware weighting or distribution adjustments, our empirical analysis reveals that class frequency alone is an insufficient indicator of adversarial vulnerability, as robust accuracy does not correlate with the number of examples per class. Furthermore, AT under long-tailed distributions exhibits optimization instability, particularly for tail classes with limited data. To address these challenges, we present Tail-Aware Dynamic Adversarial Training (TAD-AT), which integrates three complementary components targeting the training loss, attack strategy, and weight average. TAD-AT captures data imbalance and performance disparity, improving adversarial robustness under long-tailed distributions. First, our training loss incorporates frequency- and accuracy-aware regularization to emphasize learning for vulnerable classes. Second, our attack adjusts perturbations based on class-wise vulnerability, encouraging robust feature learning around vulnerable regions, thereby mitigating robustness overfitting and improving clean accuracy. Third, our weight average improves robust generalization and training stability by adaptively controlling the decay rate across classes. Experiments on long-tailed benchmarks demonstrate that our TAD-AT significantly improves adversarial robustness, offering a systematic and practical solution to robustness challenges under long-tail distributions. Our code is publicly available on https://github.com/bookman233/TADAT.

Abstract:
Decentralized identity systems have emerged as a transformative paradigm, granting users unprecedented data sovereignty and privacy-preserving capabilities, fueling critical innovations in Web3 ecosystems. However, these systems primarily serve as identity-layer solutions, forcing verifiers to engineer bespoke cryptographic protocols for access control deployment, which is an error-prone and expert-dependent process. Moreover, existing approaches fail to effectively combat credential fraud (e.g., credential theft and revoked credential reuse) without compromising privacy guarantees. This paper presents FRAC (Flexible Fraud-Resistant Access Control), an efficient decentralized access control framework that achieves two paradigm shifts: 1) Streamlined access control deployment: a logic-centric paradigm encodes access criteria through declarative verification rules, eliminating manual cryptographic protocol design while enabling instant verifier onboarding and efficient presentation generation; 2) Provable fraud resistance: a format-agnostic defensive mechanism based on Merkle trees prevents malicious credential use, requiring only lightweight hash operations and signature verification instead of computation-intensive operations. We conduct rigorous security analysis based on universally composable security and evaluate the performance, demonstrating FRAC’s security and efficiency.
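
The sketch below illustrates why a Merkle-tree check needs only lightweight hash operations: membership of a leaf is verified by rehashing along one path to the root. How FRAC builds and uses its trees is not described in the abstract, so the leaf encoding, the domain-separation prefixes, the padding rule for odd levels, and all function names here are assumptions.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, from hashed leaves up to the single root."""
    levels = [[_h(b"\x00" + leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # simplification: pad odd levels by duplicating the last node
        levels.append([_h(b"\x01" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root, leaf, proof):
    """Recompute the root from a leaf and its proof using hashes only."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + sibling + node) if sibling_is_left else _h(b"\x01" + node + sibling)
    return node == root
```

A verifier holding only the root can check a proof of logarithmic length, for example `verify(levels[-1][0], b"cred-3", prove(levels, 2))` for a tree built over hypothetical credential identifiers.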

Abstract:
Adversarial examples present significant challenges to the security of Deep Neural Network (DNN) applications. In particular, patch-based and texture-based attacks are commonly used to craft physical-world adversarial examples; because they are physically realizable, they pose real threats to security-critical applications such as person detection in surveillance and autonomous systems. Existing defense mechanisms struggle in the adaptive attack setting, i.e., when the attacks are specifically designed against them. In this paper, we propose Adversarial Spectrum Defense (ASD), a defense mechanism that leverages spectral decomposition via Discrete Wavelet Transform (DWT) to analyze adversarial patterns across multiple frequency scales. The multi-resolution and localization capability of DWT enables ASD to capture both high-frequency (fine-grained) and low-frequency (spatially pervasive) perturbations. By integrating this spectral analysis with the off-the-shelf Adversarial Training (AT) model, ASD provides a comprehensive defense strategy against both patch-based and texture-based adversarial attacks. Extensive experiments demonstrate that ASD+AT achieves state-of-the-art (SOTA) performance against various attacks, outperforming the AP of previous defense methods by 21.73% in the face of strong adaptive adversaries specifically designed against ASD.
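
For readers unfamiliar with the DWT, the following sketch computes a single-level 2D Haar transform, the simplest wavelet, to show how an image splits into one low-frequency band and three high-frequency detail bands. The choice of the Haar wavelet, the single decomposition level, and the band names are assumptions; the abstract does not say which wavelet or how many levels ASD uses.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a grid with even height and width.

    Returns (approx, row_diff, col_diff, diag), each half the size of the input.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2)  # local average: low frequency
            r_row.append((a + b - d - e) / 2)  # top row minus bottom row
            c_row.append((a - b + d - e) / 2)  # left column minus right column
            d_row.append((a - b - d + e) / 2)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag
```

A flat region produces zeros in all three detail bands, while an isolated bright pixel shows up in every band. Applying the function again to `approx` gives the next, coarser scale, which is the multi-resolution property the abstract refers to.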

Abstract:
Specific Emitter Identification (SEI) leverages unique hardware-induced Radio Frequency Fingerprints (RFFs) for secure physical-layer authentication. However, under cross-receiver scenarios where training and testing data exhibit hardware-induced distribution shifts, deep learning models are prone to shortcut learning. In such cases, networks inadvertently exploit spurious, receiver-specific artifacts as “shortcuts” for identification rather than extracting the genuine, intrinsic fingerprints of the transmitter. To overcome this challenge, we propose a robust multi-task learning framework, termed MTL-SEI. This framework synergizes spectrum-based feature extraction with receiver-invariant adversarial training and channel-aware auxiliary supervision. Specifically, a gradient reversal layer (GRL) is employed to suppress receiver-dependent features, while an equalization-state prediction task provides semantic guidance to disentangle channel-induced distortions. Furthermore, an uncertainty-guided task weighting mechanism is introduced to dynamically balance the multiple optimization objectives based on predictive variance. Evaluations conducted on the ManySig dataset under a rigorous receiver-disjoint protocol demonstrate the superior generalization capability of MTL-SEI. Notably, our method achieves a transmitter identification accuracy of 88.50%, representing a 37.7% improvement over the 1D-CNN baseline, and yields an average performance gain of over 6.92% compared to state-of-the-art domain generalization methods. These results validate the effectiveness of the proposed feature disentanglement mechanism in mitigating receiver-induced biases.

Abstract:
Face recognition poses serious privacy risks due to its reliance on sensitive and immutable biometric data. While modern systems mitigate privacy risks by mapping facial images to embeddings (commonly regarded as privacy-preserving), model inversion attacks reveal that identity information can still be recovered, exposing critical vulnerabilities. However, existing attacks are often computationally expensive and lack generalization, especially those requiring target-specific training. Even training-free approaches suffer from limited identity controllability, hindering faithful reconstruction of nuanced or unseen identities. In this work, we propose DiffMI, the first diffusion-driven, training-free model inversion attack. DiffMI introduces a novel pipeline combining robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded, confidence-aware optimization objective. DiffMI applies directly to unseen target identities and face recognition models, offering greater adaptability than training-dependent approaches while significantly reducing computational overhead. Our method achieves 84.42%–92.87% attack success rates against inversion-resilient systems and outperforms the best prior training-free GAN-based approach by 4.01%–9.82%. The implementation is available at https://github.com/azrealwang/DiffMI.

Abstract:
A straightforward method for multimodal palm-based authentication is to integrate palm shape into the system, which enhances reliability, security, and accuracy compared to unimodal methods. However, most existing methods rely on handcrafted feature extraction, which fails to fully exploit palm shape information. Moreover, there have been limited attempts to apply deep learning-based methods in this field. This paper explores a deep multimodal fusion method of palm vein (PV) and palm shape (PS) for authentication called multimodal contrastive learning channel-exchanging networks (MCCENet) to better utilize palm shape contour information. Specifically, we observe that the discriminative palm shape contour information is primarily captured in the shallow layers of the model, while the deeper layers tend to focus on irrelevant local high-level semantics. Based on this, we design hierarchical feature fusion (HFF), a module that enables inter-modal channel exchange at shallow layers. Further, we introduce a multimodal contrastive learning loss to align features across modalities, enhancing their representational embeddings. Extensive experiments across eight widely-used public datasets demonstrate that MCCENet achieves state-of-the-art performance in all cases.

Abstract:
Face morphing attacks pose a substantial risk to the reliability of face recognition systems used in passport issuance, border control, and digital identity verification. Detecting morphing attacks from a single facial image remains challenging owing to the lack of a trusted reference and the diversity of attack generation methods. This paper presents a new Single-Image Face Morphing Attack Detection (S-MAD) framework that integrates high-frequency Laplacian residual statistics with representations from a frozen, foundation-scale vision transformer. The approach employs residual-statistic-gated low-rank adapters (R-FLoRA) and feature-wise residual fusion (Res-FiLM) to enhance sensitivity to local morphing artefacts while preserving the semantic context of the backbone. A novel residual-contrastive alignment loss further regularises the fused token space, improving discrimination under unseen morphing conditions. Comprehensive experiments on four ICAO-compliant datasets, encompassing seven morph generation techniques, demonstrate that the proposed method consistently surpasses nine recent state-of-the-art S-MAD algorithms in detection accuracy and cross-domain (or dataset) generalisation. With a frozen backbone and minimal trainable parameters, the model achieves real-time efficiency and interpretability, making it suitable for real-life scenarios in biometric verification systems.

Abstract:
Random prime number generation is crucial for the implementation of several encryption and signature protocols in cryptographic applications. Recently, Quantum Random Number Generators (QRNGs) have emerged as a solution to obtain information-theoretically secure randomness, setting them apart from algorithmic generators. Nonetheless, their output is still prone to deterioration under a flawed implementation and thus requires continuous statistical validation. This process is uniquely challenging for prime number generators since they cannot be submitted to the traditional testing suites, which are primarily designed to test uniform pseudorandom sources. Here, we enable direct statistical testing of prime Random Number Generators (RNGs) by extending a validation framework based on an equiprobable binning of the prime distribution with a predictive machine learning model, which can independently learn correlations in the prime output. The validity of this framework is then verified through extensive cryptanalysis of a prime QRNG based on quadrature measurements of the vacuum state. The developed model successfully identifies flawed configurations of the prime QRNG for output lengths up to 128 bits. This assessment was additionally extended to a purely classical scheme derived from electronic noise measurements. Although no increase in the model’s predictive capacity is seen, our testing framework can nonetheless demonstrate that this classical source produces a conclusively biased prime distribution. In turn, the implemented QRNG remains resilient against this type of cryptanalysis.

Abstract:
Differentially private LoRA-based supervised fine-tuning of large language models (LLMs) often suffers from substantial utility loss, because standard DP-SGD repeatedly injects i.i.d. Gaussian noise over long training runs. To address this issue, we propose K-Temporal Correlated Differential Privacy (K-TCDP), a finite-memory noise mechanism that introduces controlled negative temporal correlation into full-dimensional noise in the LoRA adapter. This method preserves the Gaussian distributional form and remains compatible with standard Rényi differential privacy (RDP) accounting through a conservative per-step correction. By reshaping the temporal covariance over a finite window, K-TCDP allows part of the perturbation introduced at earlier steps to be offset later, which reduces cumulative perturbation energy. The method remains lightweight in practice, since it only changes the noise generation rule and the associated privacy accountant. We evaluate K-TCDP on four GLUE tasks and further assess generation quality on DART using BLEU-4 and ROUGE-L. Experimental results show that K-TCDP consistently outperforms DP-SGD at the same privacy levels while introducing only small computational overhead. This indicates that the proposed method provides a practical solution for better balancing the privacy-utility tradeoff in private LoRA-based fine-tuning.

Abstract:
Large Vision Language Models (VLMs) are shown to be vulnerable to jailbreak attacks. Current attack detection methods often fall short due to their inability to comprehensively monitor the large input-output space or to accurately monitor safety-critical neurons associated with harmful semantics, resulting in high false positives and false negatives. In this paper, we introduce VLM-Guard, a highly effective detection framework that defends against jailbreak attacks by precisely identifying critical neurons linked to unsafe behaviors. Leveraging a tailored differential analysis over a large corpus of activation values, VLM-Guard isolates a compact set of neurons—just a few hundred, comprising less than 0.2% of the total—that are strongly correlated with harmful semantics. This enables the design of an attack detector that is not only effective at monitoring adversarial behavior but also lightweight and training-free (i.e., no parameter updates or model fine-tuning), making it well-suited for practical deployment. Extensive evaluations demonstrate that VLM-Guard excels in detecting jailbreak attacks while preserving benign performance in attack-free settings, offering an effective and efficient solution for safeguarding VLMs.

Abstract:
While person re-identification (ReID) is widely deployed, its security against sophisticated attacks remains a critical concern. Existing attacks based on pixel-level perturbations or backdoors provide only a partial view of vulnerabilities and are often impractical in realistic open-set scenarios. To overcome these limitations and ultimately strengthen robustness, we propose the Diffusion-based Semantic Camouflage Attack (DSCA), a framework that exposes vulnerabilities in the high-level semantic space. Rather than manipulating pixels, DSCA instantiates a conditional diffusion generator that subtly edits latent semantic attributes (for example, clothing color and texture) while preserving visual coherence, thereby impersonating a specified target identity at inference. This generative formulation enables operation in a zero-query, black-box setting without access to or feedback from the victim model. In the offline setting, the network is trained on a diverse set of surrogate ReID models with different backbones, including CNN and Transformer architectures, to encourage cross-model transferability. In the online setting, DSCA directly produces a camouflage image that deceives the victim system into matching the attacker with a specified target identity, achieving a successful attack without any model interaction. Extensive experiments on major ReID benchmarks validate the approach, showing high attack success rates (over 95% in our setting), strong perceptual fidelity, and evasion of advanced defenses. By exposing security gaps at the semantic level, DSCA provides a practical diagnostic tool to inform defense objectives and guide the development of more robust ReID systems.

Abstract:
Palmprint recognition has been extensively studied as an effective biometric technique for personal identification. With the rapid development of deep neural networks, palmprint recognition methods have achieved remarkable progress. However, their performance often deteriorates significantly under domain shifts. Moreover, existing unsupervised domain adaptation approaches for palmprint recognition typically suffer from unstable training and imprecise feature alignment, thereby limiting their effectiveness. To address these challenges, we propose SPA, a Stable and Precise Alignment framework for cross-domain palmprint recognition. Specifically, we design a lightweight yet robust Style Transformation Module (STM) to mitigate variations in style, color, and illumination. With the aid of STM, we further align joint feature distributions across all high-level layers, achieving more accurate feature alignment and enhancing recognition robustness. We conduct extensive experiments on two public multi-domain palmprint databases encompassing 42 cross-domain scenarios. The results demonstrate that SPA consistently delivers superior performance across both databases, achieving higher recognition accuracy with lower computational overhead compared to existing methods. In particular, SPA improves the average identification accuracies to 94.21% and 81.93%, while reducing the average equal error rates to 1.36% and 3.62% on the two databases, respectively.

Abstract:
The remarkable capability of large language models (LLMs) has led to the wide application of LLM-based agents in various domains. To standardize interactions between LLM-based agents and external resources, model context protocol (MCP) tools have become the de facto standard and are now widely integrated into these agents. However, the incorporation of MCP tools introduces the risk of tool poisoning attacks, in which malicious MCP tools can steer the behavior of LLM-based agents toward unintended outcomes. Although previous studies have identified such vulnerabilities, their red teaming approaches have largely remained at the proof-of-concept stage, leaving the automatic red teaming of LLM-based agents under the MCP tool poisoning paradigm an open question. To bridge this gap, we propose AutoMalTool, an automated red teaming framework for LLM-based agents by generating malicious MCP tools. Our extensive evaluation shows that AutoMalTool effectively generates malicious MCP tools capable of manipulating the behavior of mainstream LLM-based agents while evading current detection mechanisms, thereby revealing new security risks in these agents.

Abstract:
Outsourcing image management to a cloud should not only protect the confidentiality of image data, but also maintain the capability of reverse image search, which requires identifying the existing stored images that are similar to an input image. Previous studies build on cryptographic approaches to realize reverse image search on encrypted images, yet they fall short on either security or performance. This paper explores trusted image search, which uses Intel SGX to realize reverse image search in an enclave, in order to provide security guarantees via SGX while performing search on plain data (inside the enclave) for performance. However, due to the resource limits of SGX, directly realizing the search process in the enclave incurs high performance overhead. We present TrustSearch, which implements various design approaches to mitigate the resource overhead of SGX. We evaluate TrustSearch using real-world image datasets, and show that it outperforms state-of-the-art approaches in search performance while preserving space efficiency for the enclave.

Abstract:
Current studies have discovered that model extraction attacks (MEA) can steal the functionality of deep learning (DL) models, thus causing economic loss and other security threats. Extraction attackers can build a clone model locally that has a different structure but similar functionality to the victim model. To counter MEA, defenders utilize realistic auxiliary data to enhance the victim model and produce misleading predictions for attack data. However, these defense methods have three critical problems caused by utilizing realistic auxiliary data. First, in some scenarios, realistic auxiliary data is absent and difficult to obtain. Second, the defense effectiveness brought by realistic auxiliary data is unstable. Finally, realistic auxiliary data does not protect all categories of training data, resulting in higher clone accuracy for some categories. To address these issues, we propose Model Defense Variational Autoencoder (MDV) to generate virtual auxiliary data as a replacement for realistic auxiliary data. MDV combines the Variational Autoencoder (VAE) and a classifier, compelling the latent features to obey different multivariate Gaussian distributions according to the categories. Then, MDV samples deep features from low-likelihood regions of the different distributions as virtual auxiliary data. During the experimental phase, we apply our auxiliary data to different defense methods that use auxiliary data and compare the defense effects in different scenarios. Experimental results demonstrate that our method effectively addresses the three aforementioned issues.

Abstract:
Prompt learning is a new machine learning paradigm that has attracted ample attention due to its simplicity and proven efficacy. Despite its growing adoption, the security vulnerabilities associated with this paradigm remain underexplored. In this work, we take the first step to propose BadBone, a stealthy and adaptive backdoor attack against prompt learning using bi-level optimization. Instead of backdooring the prompt learning process, we aim to compromise a backbone model such that only target downstream tasks employing prompt learning inherit the backdoor vulnerability. Extensive experiments on three different models and three datasets from various domains show that our targeted/untargeted backdoored models achieve high attack performance while maintaining utility on both pre-training and downstream tasks. Moreover, we evaluate our approach against six state-of-the-art model-level defenses, including Neural Cleanse, ABS, MNTD, NAD, CLP, and D-BR. The results demonstrate that these defenses are largely ineffective against our backdoored models and thus leave the effective defense as an important direction for future work. Our code is available at https://github.com/TrustAIRLab/BadBone

Abstract:
Deep Neural Networks (DNNs) are gradually becoming indispensable in various technological domains. To cater to more deployment backends and increasingly complex model architectures, deep learning compiler-driven efficient compilation modes are becoming essential components of productivity. However, this deployment method exacerbates security risks. Recent studies have shown that attackers can reverse-engineer executable files to regenerate trainable deep learning models, leading to adversarial attacks and other security breaches. Previous research indicates that such attacks pose significant threats, yet progress in implementing cost-effective mitigation strategies remains limited. Existing defense mechanisms primarily focus on Trusted Execution Environments or partial encryption to protect critical model parameters, often at the expense of compiled execution efficiency. To address this gap, we propose a schedule search based operator obfuscation method (SOOM) to defend against model extraction attacks for models compiled and executed on standard CPU and GPU backends, where low latency on device inference is required. SOOM is built on TVM, a deep learning compiler, and constructs a comprehensive obfuscation space for deep learning operators. It leverages a security aware learned cost model based on XGBoost gradient boosted trees to balance security objectives and performance requirements, and ultimately generates obfuscated executable code for various deep learning operators. Extensive experiments covered over 105 operator configurations and more than 30,000 tensor computation test cases. Our method was tested against state-of-the-art model extraction attacks, raising the operator inference failure rate to as high as 89%. We also observe up to approximately 25.4% performance gains in selected cases, while the balanced setting keeps model-level latency overhead within a modest budget.

Abstract:
This work addresses the problem of secrecy energy efficiency (SEE) maximization in RIS-aided wireless networks. Active and nearly-passive RISs are compared and their trade-off in terms of SEE is analyzed. Considering both perfect and statistical channel state information, two SEE maximization algorithms are developed to optimize the transmit powers of mobile users, the RIS reflection coefficients, and the base station receive filters. Numerical results quantify the trade-off between active and nearly-passive RISs in terms of SEE, with active RISs yielding worse SEE values as the static power consumed by each reflecting element increases.

Abstract:
Video has become increasingly widespread in information transmission and daily communication. However, video data often contain sensitive privacy information, which leads to privacy concerns. To address these concerns, video encryption has become one of the most effective privacy protection methods. Current efforts in video encryption focus on block-level or frame-level encryption, failing to provide fine-grained privacy protection. In response, we propose a personalized pixel-level video encryption approach for privacy protection. Our approach operates before encoding and is robust to lossy compression. It applies a quadruple encryption algorithm directly to the pixels within privacy-sensitive areas segmented by instance segmentation, achieving fine-grained, pixel-level encryption. Different encryption keys are managed using Attribute-Based Encryption (ABE) technology, which enables personalized access control. This approach assigns different permissions based on user identity, enabling multi-level access to encrypted video content. We deploy our approach across different devices and perform extensive experimental evaluations. Experimental results show that on different devices, the single-frame encryption overhead of our approach for videos of various resolutions consistently remains below 0.12 seconds, with the encrypted area achieving an average PSNR of 8.50 dB and an average SSIM of 0.079. When using a key with one bit difference from the correct key for decryption, the average number of pixels changing rate (NPCR) of the test sequence is 99.59% and the average unified average change intensity (UACI) is 32.40%. This shows that unauthorized users cannot obtain valid information when decrypting with an incorrect key.
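
The two key-sensitivity metrics quoted above follow their standard definitions and can be computed as in the sketch below: NPCR is the percentage of pixel positions whose values differ between two images, and UACI is the mean absolute difference normalized by the maximum pixel value. The function name and the list-of-rows image representation are assumptions for illustration; the paper's own evaluation code is not shown in the abstract.

```python
def npcr_uaci(image_a, image_b, max_value=255):
    """Return (NPCR, UACI) in percent for two equally sized grayscale images."""
    total = 0
    changed = 0
    abs_diff = 0
    for row_a, row_b in zip(image_a, image_b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            changed += pa != pb        # counts positions where the pixels differ
            abs_diff += abs(pa - pb)   # accumulates intensity change
    npcr = 100.0 * changed / total
    uaci = 100.0 * abs_diff / (max_value * total)
    return npcr, uaci
```

For context, two independent, uniformly random 8-bit images give expected values of about 99.61% (NPCR) and 33.46% (UACI), which is the usual reference point for figures such as those reported above.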

Abstract:
The critical energy infrastructure is undergoing two significant transformations: the rapid increase in renewable distributed energy resources (DER) and the digitalization of the energy sector, collectively shaping what is known as the Internet of Energy (IoE). Artificial intelligence (AI) has become a widely adopted tool for effectively allocating energy and managing sector-related resources, where ensuring fairness is essential. While inherent unfairness in AI systems is well acknowledged, little attention has been given to evaluating this unfairness and its real-world implications within the context of the IoE. In this study, we take a first step to elucidate the unfairness in AI-powered IoE systems induced by malicious users. We introduce Unfairness Score (UScore), a novel metric designed to evaluate the unfairness of machine learning models in real-world IoE scenarios. We then extensively evaluate unfairness attacks using three IoE tabular datasets, demonstrating that AI model fairness can be compromised through data poisoning, whether in centralized learning (CL) or federated learning (FL) settings. Notably, such compromises can occur when malicious users tamper with only a small subset of the data they control. Finally, we propose a novel approach that unifies fairness and differential privacy (DP) by leveraging DP as a provable defense mechanism. This approach provides a universally applicable solution to unfairness attacks, regardless of whether the learning tasks are classification or regression, and is effective in both FL and CL settings. Our contributions represent a significant step in addressing unfairness and privacy concerns in AI-powered IoE systems.

Abstract:
Decentralized federated learning (DFL) has been widely used in edge computing and Internet of Things (IoT) settings with many devices. However, the heterogeneity of devices (e.g., varying computational capacity, stability, and security requirements) can impact the performance of DFL applications. In this paper, we present SSAA, a secure aggregation scheme for DFL on heterogeneous devices that is designed to improve the efficiency of DFL while preserving privacy. Specifically, SSAA accelerates aggregation by synchronously coupling aggregation device-set formation with aggregation computation, and it copes with variability in device availability and performance to maintain stable and efficient aggregation. By extending homomorphic encryption to support cross-round ciphertext continuity, SSAA enables reliable and secure decryption under large-scale dropouts in the original aggregation set, making it practical for dynamic DFL environments. In addition, we prove that SSAA is secure in the semi-honest model and resistant to device collusion attacks, both of which are fundamental security requirements for applications involving heterogeneous devices. We also implement SSAA and comprehensively evaluate its performance to demonstrate its practicality.

Affiliations: School of Artificial Intelligence and Information Engineering, Zhejiang University of Science and Technology, Hangzhou, China; School of Electronic and Information Engineering, Liaoning University of Technology, Jinzhou, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an, China; Zhejiang Key Laboratory of Artificial Intelligence of Things (AIoT) Network and Data Security, Hangzhou, China; Center for Biometrics and Security Research and the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China

Abstract:
With the rapid development of Large Language Models (LLMs), an increasing number of Large Visual-Language Models (LVLMs) have achieved unprecedented performance in response generation. Recent work shows that LVLMs are vulnerable to adversarial attacks. However, many existing methods tend to overfit to the source model by overemphasizing specific features, which compromises their transferability. Other approaches suffer from reduced attack effectiveness due to insufficient differentiation between features. In this paper, we propose a novel transfer-based black-box untargeted attack—Shared Adversarial Feature (SAF) dynamic attack. By exploring the feature extraction patterns of LVLMs, we identify the features shared among various models that are most susceptible to adversarial attacks and disrupt them. Moreover, due to the powerful attention mechanisms of LVLMs, they are still able to extract similar semantics from perturbed images, even when primary features are disrupted. We design a dynamic update strategy to address this challenge. Finally, from the perspective of SAF, we conduct an in-depth analysis of vulnerabilities in the vision encoder and projector within LVLMs and find that attacking the projector exhibits stronger transferability across heterogeneous model architectures. Extensive experiments show that our method exhibits superior attack performance compared to existing methods across different models, datasets, and tasks.

Abstract:
One of the biggest risks that wireless IoT networks face is malware or botnet epidemics. Malware can propagate from one device to any other device within its coverage range as long as no checkpoint (firewall) protects that device. Firewalls can be hardware (special devices) or software licenses installed on a limited number of devices. Unfortunately, in both cases the number of firewalls in any network is usually limited due to cost constraints. Therefore, it is crucial to reduce the number of required firewalls and/or make efficient use of the available firewalls. In this paper, we consider two optimization problems for firewall placement in a massive wireless IoT network. The objective of the first problem is to reduce the number of firewalls required to partition the network into isolated clusters/partitions of a given maximum size. The second problem aims to reduce the maximum size of the isolated partitions that can be achieved with a given number of firewalls. These two clustering problems are non-convex and are known to be NP-complete. However, we provide efficient algorithms, with different variations, to solve the two problems, and we compare their performance to the well-known K-Means and Spectral Partitioning algorithms. Simulation results show that, for both problems, the proposed algorithms outperform both baselines on average. Furthermore, we show that adding the proposed algorithm as a second stage after the Spectral Partitioning algorithm significantly improves its performance.
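Both problems above score a candidate firewall placement by the sizes of the isolated partitions it leaves behind. The sketch below shows one way that objective could be evaluated; it is not the paper's placement algorithm. It assumes the network is an undirected graph given as an adjacency dictionary, that a firewall node blocks all propagation through it, and that firewall nodes are not counted in any partition. The function name `max_partition_size` is hypothetical.

```python
from collections import deque


def max_partition_size(adjacency, firewalls):
    """Size of the largest connected group of unprotected devices.

    adjacency: dict mapping each device to an iterable of its neighbors
               (assumed undirected: every edge appears in both directions).
    firewalls: devices hosting a firewall; assumed to block propagation
               and to be excluded from every partition.
    """
    blocked = set(firewalls)
    seen = set()
    largest = 0
    for start in adjacency:
        if start in blocked or start in seen:
            continue
        # Breadth-first search over the partition containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in blocked and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        largest = max(largest, size)
    return largest
```

With this helper, the first problem asks for the smallest firewall set whose score stays under a given bound, and the second asks for the lowest score achievable with a fixed number of firewalls.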

Abstract:
Face anti-spoofing, which aims to prevent attacks on widely used face recognition systems, is closely tied to personal privacy and property security. However, existing face anti-spoofing benchmarks mainly focus on RGB images and are increasingly challenged by continually improving 3D high-fidelity (HiFi) masks. To facilitate research on multimodal face anti-spoofing, we construct the HyperSpectral Face Anti-Spoofing (HySpeFAS) dataset. We use the newly developed snapshot spectral imaging (SSI) technology to capture real and spoof faces and to identify unknown HiFi masks. Specifically, hyperspectral images (HSIs) acquired by an SSI sensor contain rich information about the chemical composition of the targets, which can be used to effectively distinguish live human skin from various spoof materials. The HySpeFAS dataset contains 22,368 multimodal images (i.e., RGB, SSI, HSI) of 17 live subjects and 60 spoof subjects. Moreover, extensive experiments with baseline deep learning models validate the distinctive features of the SSI images and the potential of SSI in face anti-spoofing (FAS). By publishing the dataset and the baseline models, we encourage the community to advance the study of algorithms for hyperspectral images and the development of SSI-equipped intelligent systems.

Abstract:
In Segment Routing over IPv6 (SRv6) networks, a wide range of network events (e.g., attacks, intrusions, violations, malicious route announcements) may occur. Network management requires real-time monitoring of untrusted and unreliable environments (e.g., unsafe components and devices). Early localization of abnormal links causing violations in the SRv6 network helps minimize the compensation required for service unavailability. However, the overhead of state-of-the-art methods does not scale efficiently to large-scale SRv6 networks, and these methods exhibit poor robustness to the various disturbances of unreliable networks. To cope with these challenges, we propose Glint, an in-band network telemetry framework to localize abnormal links in SRv6 networks. The key idea of Glint is sampling part of the information while the overall information is known. Glint provides probabilistic in-band collection to gather segment-level telemetry data, reducing overhead and improving efficiency. Glint also employs distributed verification-based detection to enhance the trustworthiness of security assessments, further improving robustness against disturbances. In addition, we design selective telemetry that reduces telemetry reports while preserving security-relevant visibility. Our evaluations demonstrate that, compared to state-of-the-art frameworks, Glint reduces header bandwidth overhead by 75.6% and memory overhead by 48.7% while reducing false positives. We also implement Glint on the Intel Tofino switch, achieving over a 50% reduction in hardware resource consumption compared to existing methods.

Abstract:
In recent years, with the widespread application of Vision Transformer (ViT) in visual trackers, their robustness has received increasing attention. However, by focusing on global interactions between image patches, ViT reduces sensitivity to local noise, posing new challenges for adversarial attacks. Meanwhile, existing decision-based adversarial attack methods often overlook the differences in noise sensitivity between patches, further limiting the compression efficiency of adversarial noise, especially in ViT. In visual tracking, existing adversarial attack methods primarily target Siamese network-based trackers, and research on adversarial attacks against Transformer-based trackers, particularly decision-based black-box attacks, is still relatively limited. To implement effective black-box attacks on Transformer-based trackers, this paper proposes patch-based adversarial noise compression (PANC), a decision-based adversarial attack method. This method compresses adversarial noise patch by patch, significantly improving compression efficiency and attack concealment. PANC also introduces a noise sensitivity matrix that dynamically adds and reduces adversarial noise, optimizing the spatial distribution of noise while decreasing the number of queries. We validate the effectiveness of PANC on several Transformer-based trackers, including OSTrack, STARK, TransT, and MixformerV2, and on three public large-scale benchmark datasets: GOT-10k, TrackingNet, and LaSOT. Experimental results show that, compared to the existing state-of-the-art adversarial attack method, the IoU attack, PANC compresses the noise level to 10% and improves attack effectiveness by 162% while using only 45.7% of the queries. Furthermore, PANC can serve as an initialization or post-processing optimization strategy for other adversarial attack methods, providing a more flexible and efficient mechanism for adversarial example generation. Our work reveals the vulnerabilities of existing Transformer-based visual trackers and offers new ideas for further improving the efficiency and concealment of adversarial attacks.

Abstract:
LoRa has emerged as a strong wireless communication technology for IoT devices. The security challenges of LoRa have also raised various concerns. We propose SymScrab, a simple physical layer solution that significantly improves the security of LoRa without affecting the communication performance or consuming more communication resources. SymScrab encrypts the message by scrambling the baseband samples with a pseudo-random permutation, which is easy to recover by the legitimate receiver who has the same permutation, but impossible for an adversary. We further propose a physical layer Message Integrity Check (MIC) that rejects spoofed packets because spoofed packets do not exhibit expected physical layer features. We provide a rigorous security analysis of SymScrab and prove guarantees for confidentiality and integrity. We implement SymScrab as an extension to the open-source implementation of LoRa, and test it with both over-the-air experiments and simulations. The results confirm that SymScrab does not negatively affect communication performance.
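The scrambling step described above can be pictured as a shared permutation applied to a block of baseband samples and inverted at the receiver. The sketch below is a toy illustration under stated assumptions, not the SymScrab implementation: the function names are hypothetical, the permutation is derived from a shared seed with Python's `random.Random` (which is not cryptographically secure and only stands in for whatever generator the scheme actually uses), and LoRa modulation itself is not modeled.

```python
import random


def make_permutation(shared_seed, n):
    """Derive a permutation of range(n) from a seed known to both ends.

    random.Random is NOT a cryptographic generator; it is used here only
    to keep the sketch self-contained.
    """
    perm = list(range(n))
    random.Random(shared_seed).shuffle(perm)
    return perm


def scramble(samples, perm):
    """Transmitter side: output position k carries input sample perm[k]."""
    if len(samples) != len(perm):
        raise ValueError("block length must match permutation length")
    return [samples[index] for index in perm]


def descramble(scrambled, perm):
    """Receiver side: put each received sample back at its original index."""
    if len(scrambled) != len(perm):
        raise ValueError("block length must match permutation length")
    restored = [None] * len(perm)
    for position, index in enumerate(perm):
        restored[index] = scrambled[position]
    return restored
```

Because a permutation only reorders samples, the block length and the sample values are unchanged, which is consistent with the claim that no extra communication resources are consumed.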

Abstract:
In the Machine Learning as a Service paradigm, a service provider (e.g., a server) hosting a model offers inference APIs to clients, who can send their queries and receive the inference results. While most recent secure inference works focus on addressing privacy issues, they overlook the importance of checking the service quality and reliability. A malicious server may deviate from the protocol specification to deliberately provide incorrect services such as using low-quality models. Thus, it is necessary to design new solutions to empower clients to verify the server’s model accuracy and inference integrity while protecting both parties’ privacy. We present Conan, a new secure and reliable inference framework against malicious servers to achieve accuracy verification, inference integrity, and privacy simultaneously. In Conan, the server first commits to the model and proves in zero-knowledge that the committed model achieves the claimed accuracy. Then both parties perform secure inference on the committed model against the malicious server. To instantiate the above framework, we design generic maliciously secure two-party computation (2PC) protocols with a fixed corrupted party, which may be of independent interest. Our protocols achieve high efficiency by utilizing the advantage that the semi-honest party can check the behavior of the corrupted party. Furthermore, they support both arithmetic and Boolean circuit evaluation, a crucial attribute for secure inference on complicated machine learning models. We implement the fixed-corruption 2PC protocols for our secure and reliable inference. The experimental results show 1 to 2 orders of magnitude improvements over conventional maliciously secure protocols in terms of communication and computation costs.
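The "commit first, then prove and infer" ordering above relies on a commitment that binds the server to one model. The abstract does not say which commitment scheme Conan uses, so the sketch below swaps in a plain salted-hash commitment purely to illustrate the commit and open steps. It provides no zero-knowledge proof of accuracy and no secure inference, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(model_bytes):
    """Server side: bind to a serialized model without revealing it.

    Returns (commitment, nonce). The commitment can be published;
    the nonce stays secret until the commitment is opened.
    """
    nonce = secrets.token_bytes(32)  # random salt so the digest hides the model
    commitment = hashlib.sha256(nonce + model_bytes).digest()
    return commitment, nonce


def verify_opening(commitment, nonce, model_bytes):
    """Check that (nonce, model_bytes) opens the published commitment."""
    expected = hashlib.sha256(nonce + model_bytes).digest()
    return hmac.compare_digest(commitment, expected)
```

In a framework like the one described, the model would never be opened to the client; later proofs and secure computation would instead refer to the committed value, so a server cannot silently substitute a lower-quality model after committing.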

Abstract:
Set-valued data, an important data form expressing diversity and uncertainty, is widely used in fields such as recommendation systems and social network analysis. However, such data often contains fine-grained records, which may lead to the leakage of users’ sensitive information. To this end, some privacy-preserving set-valued data analysis schemes have been proposed. This paper first proposes a joint shift inference attack against the cyclic shift-based local differential privacy (LDP) protocol introduced by Huang et al., which exploits deterministic cyclic-shift patterns and significant frequency differences to infer the user’s original data. Experimental results demonstrate that the user’s original data can be inferred with a probability exceeding 96%. Theoretically, the cyclic-shift mechanism violates the core requirement of local differential privacy due to its non-surjective output space. To overcome the limitations of existing schemes, we propose a privacy-preserving joint distribution analysis scheme for set-valued data via LDP (SVJDA). It employs the Sparse Vector Mean Estimation (SVME) mechanism and utilizes a sign-based hashing function to compress data, allowing privacy-preserving joint distribution analysis while introducing only minimal additional noise. Theoretical analysis shows that SVJDA satisfies $\epsilon$-LDP with minimal estimation error. The experimental results confirm that SVJDA achieves higher accuracy in joint distribution estimation while ensuring the accuracy of frequent itemset identification. For $\epsilon \in [0.4, 1]$, the $L_\infty$ error of SVJDA is only 7.208%-21.725% of SVSM and 2.821%-7.279% of LDP-RM, while its MSE is 0.00364%-2.338% of SVSM and 0.017%-0.068% of LDP-RM, demonstrating its superior performance.
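The remark about a non-surjective output space can be read against the standard definition of LDP. The block below states that definition and the one-line consequence; the symbols $\mathcal{M}$, $x$, $x'$, $y$, and $y^*$ are notation introduced here for illustration, and this is one plausible reading of the abstract's claim rather than the paper's own proof.

```latex
% \mathcal{M}: randomized mechanism; x, x': any two inputs; y: any output.
% Standard \epsilon-LDP requirement:
\[
  \Pr[\mathcal{M}(x) = y] \;\le\; e^{\epsilon}\,\Pr[\mathcal{M}(x') = y]
  \qquad \text{for all inputs } x, x' \text{ and all outputs } y .
\]
% Suppose some output y^* is reachable from x but not from x':
\[
  \Pr[\mathcal{M}(x) = y^*] > 0
  \quad\text{and}\quad
  \Pr[\mathcal{M}(x') = y^*] = 0 .
\]
% The requirement would then force a positive number to be at most
% e^{\epsilon} \cdot 0 = 0, so it fails for every finite \epsilon.
```

Observing such an output rules out the input $x'$ with certainty, which is the kind of leakage an inference attack can exploit.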

Abstract:
With the rapid advancement of 6G, identity authentication has become increasingly critical for ensuring wireless security. The lightweight and keyless Physical Layer Authentication (PLA) is regarded as an instrumental security measure in addition to traditional cryptography-based authentication methods. However, existing PLA schemes often struggle to adapt to dynamic radio environments. To overcome this limitation, we propose the Adaptive PLA with Channel Extrapolation and Generative AI (APEG), designed to enhance authentication robustness in dynamic scenarios. Leveraging Generative AI (GAI), the framework adaptively generates Channel State Information (CSI) fingerprints, thereby improving the precision of identity verification. To refine CSI fingerprint generation, we propose the Collaborator-Cleaned Masked Denoising Diffusion Probabilistic Model (CCMDM), which incorporates collaborator-provided fingerprints as conditional inputs for channel extrapolation. Additionally, we develop the Cross-Attention Denoising Diffusion Probabilistic Model (CADM), employing a cross-attention mechanism to align multi-scale channel fingerprint features, further enhancing generation accuracy. Simulation results demonstrate the superiority of the APEG framework over existing time-sequence-based PLA schemes in authentication performance. Notably, CCMDM exhibits a significant advantage in convergence speed, while CADM, compared with model-free, time-series, and VAE-based methods, achieves superior accuracy in CSI fingerprint generation. The code is available at https://github.com/xiqicheng192-del/APEG

Abstract:
Federated learning (FL) enables collaborative training across distributed clients but remains vulnerable to Byzantine attacks, especially stealthy ones. The threat is even amplified in non-IID settings, where client heterogeneity causes greater divergence in feature distributions and inter-client distances. Existing defenses often rely on strong assumptions or raw update distances, limiting their effectiveness under such heterogeneity. To address this gap, we propose FedRefiner, a decoupled dual-refined aggregation algorithm designed to mitigate stealthy attacks on heterogeneous data. Our intuition is that the significance distribution of client updates reveals subtle malicious evasion, altering critical features for attack while perturbing unimportant ones, thereby exposing true inter-client distances. FedRefiner goes beyond norm-based filtering by refining both weighted scores and aggregated updates, enabling more accurate distinction between malicious behavior and benign non-IID variation. It first derives significance distribution vectors as refined updates by sparsity, then clusters them to compute weighted similarity scores for group reliability. These clusters then align raw updates into groups for group-wise refinement, yielding robust aggregated updates. We theoretically prove the convergence of FedRefiner under Byzantine attacks in non-IID settings. Extensive evaluation on 8 datasets against 10 attacks (including 2 adaptive ones) and 13 defenses shows that FedRefiner outperforms state-of-the-art defenses, achieving up to a 10% gain in overall accuracy and a 14.8% improvement in worst-case performance under both IID and non-IID settings. Ablation studies further demonstrate its robustness across different hyperparameters, attacker ratios, data heterogeneity, and model/client scales, while incurring low computation and no storage overhead.

Abstract:
Autonomous Vehicles (AVs) rely extensively on GPS signals for navigation, exposing them to a wide range of GPS spoofing attacks, from simplistic signal manipulation to sophisticated, coordinated falsification. Existing detection and mitigation solutions, both conventional and AI-based, face several critical limitations: they struggle to adapt to novel or evolving GPS spoofing strategies, rely on shallow or handcrafted features that fail to capture the semantic complexity of signal distortions, and often lack scalability, as they are typically designed for isolated scenarios and cannot be readily extended to heterogeneous AV fleets. In this study, we introduce a novel framework that integrates multiple Lightweight Language Models (LightLMs), including BERT, RoBERTa, DistilBERT, and TinyBERT, with Reinforcement Learning (RL) algorithms such as Q-Learning, Deep Q-Network, Advantage Actor-Critic, and Proximal Policy Optimization, to improve detection and mitigation of GPS spoofing. The LightLMs are used to convert structured GPS-related features into enriched state embeddings, which serve as input to the RL agent. These embeddings provide semantically meaningful representations that help the agent recognize complex spoofing behaviors and apply mitigation strategies. To train and evaluate the proposed models, we build a Python-based simulation environment that emulates multiple spoofing scenarios and integrates LightLM-driven state inputs. Experimental results across the datasets used show that RL models enhanced with LightLM-generated state representations significantly outperform their baseline counterparts in detection accuracy, mitigation efficiency, and response time. The results also demonstrate the proposed approach’s scalability, generalizability, and operational reliability for secure AV navigation.

Abstract:
Ateniese et al. (EuroS&P 2017) proposed the notion of redactable blockchains (RBs), in which a designated party uses a secret key to modify blockchain history without causing a hard fork. Nevertheless, redactions may be performed mistakenly or maliciously due to misbehavior or operational errors. From a regulatory perspective, any RB design must therefore incorporate accountability and traceability mechanisms to ensure that redactions are non-abusive and publicly verifiable. As a countermeasure, we propose the notion of time-updatable policy-based chameleon hash (TPCH). This construction addresses regulatory concerns by enabling publicly verifiable proofs of redaction and traceable user identities. Our basic building block, termed time-updatable chameleon hash (TUCH), provides redaction accountability through an intrinsic property formalized as Type-2 Trapdoor Collisions. TUCH is functionally versatile and achieves acceptably efficient performance compared to peer chameleon hash schemes. Following the heuristics of Camenisch et al. (PKC 2017) and Derler et al. (NDSS 2019), we further extend TUCH by integrating attribute-based encryption (ABE) to obtain a time-updatable, policy-based variant, namely TPCH. The resulting scheme overcomes the limitations of coarse-grained redaction and the impracticality of specifying the exact modifier in advance. Overall, TPCH provides a secure, efficient, and comprehensive solution for accountable and traceable redactable blockchains under practical regulatory requirements. Our systematic analysis further demonstrates the suitability of TPCH for small-scale deployment.
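Redaction without a hard fork rests on the chameleon-hash property that a trapdoor holder can find a second input with the same digest. The sketch below shows that property with the textbook discrete-logarithm chameleon hash and toy parameters. It is not TUCH or TPCH, it has none of their accountability, time-update, or policy features, and all names and parameter values are chosen for this example only. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy group: P = 2*Q + 1 with Q prime, and G generates the order-Q subgroup
# of the integers modulo P. These sizes are far too small for real use.
P, Q, G = 23, 11, 4


def keygen(trapdoor):
    """Return the public key h = G^x mod P for a trapdoor x in [1, Q-1]."""
    if not 1 <= trapdoor < Q:
        raise ValueError("trapdoor must be in [1, Q-1]")
    return pow(G, trapdoor, P)


def chameleon_hash(public_key, message, randomness):
    """CH(m, r) = G^m * h^r mod P, with m and r taken modulo Q."""
    return (pow(G, message % Q, P) * pow(public_key, randomness % Q, P)) % P


def find_collision(trapdoor, message, randomness, new_message):
    """Use the trapdoor x to find r' with CH(m, r) == CH(m', r').

    Solves m + x*r = m' + x*r' (mod Q) for r'.
    """
    inverse = pow(trapdoor, -1, Q)  # modular inverse of the trapdoor
    return (randomness + (message - new_message) * inverse) % Q
```

A block whose link to the next block is computed with such a hash can have its content changed from `message` to `new_message` by the trapdoor holder while the digest, and hence the chain, stays intact.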

Abstract:
A dealer aims to share a secret with participants so that only predefined subsets can reconstruct it, while others learn nothing. The dealer and participants access correlated randomness and communicate over a one-way, public, rate-limited channel. For this problem, we propose the first explicit coding scheme able to handle arbitrary access structures and achieve the best known achievable rates, previously obtained non-constructively. Our construction relies on lossy source coding coupled with distribution approximation to handle the reliability constraints, followed by universal hashing to handle the security constraints. As a by-product, our construction also yields explicit coding schemes for secret-key generation under one-way, rate-limited public communication that, unlike prior work, achieve the capacity for arbitrary source correlations and do not require a pre-shared secret to ensure strong secrecy.

Abstract:
The increasing demand for efficient and low-power deep neural network (DNN) inference has advanced the adoption of ReRAM-based compute-in-memory (CiM) accelerators, which perform computations directly within memory to reduce energy consumption and enhance throughput. However, such architectures are vulnerable to security threats, especially in a multi-tenant environment where multiple users share the same physical resources. This paper introduces a new attack model for multi-tenant ReRAM-based CiM, power hammering, that exploits the temperature sensitivity of ReRAM cells, inducing local temperature increases that lead to conductance drift and ultimately result in erroneous inference outcomes. This serves as a denial-of-service (DoS) attack, where malicious co-tenants degrade inferencing accuracy and system reliability for legitimate users in a shared environment, ultimately undermining trust and causing potential losses to the service provider. Additionally, we propose a novel strategy to counter this security vulnerability. In this technique, we focus on selectively protecting important weights with error compensation hardware. These important weights are treated as faults, and their computation is offloaded to compensation hardware. Simulation results confirm the effectiveness of the proposed method in ensuring accurate classification results even under adversarial conditions, thereby enabling secure multi-tenant inference on ReRAM-based CiM accelerators.

Abstract:
Unsupervised Visible-Infrared Person Re-Identification (US-VI-ReID) holds great promise because it does not require label information. However, pedestrian images corrupted by real-world factors (e.g., noise, blur, and weather changes) largely limit the scalability of US-VI-ReID. In this paper, we explore the robustness of US-VI-ReID for the first time and propose a Multi-Granularity Spatial-Frequency Prototype Learning (MSPL) framework. The framework mainly consists of Multi-Channel Soft Augmentation (MSA), a Robust Frequency Domain Feature Learning (RFL) module, and Cross-modal Spatial-Frequency Prototype Matching (CSPM). Specifically, MSA alleviates the sensitivity of the model to color and abnormal samples through rich channel combinations and soft erasing. Subsequently, RFL performs deep global filtering and amplitude attention compensated InstanceNorm to complete frequency and style modulation, concentrating on degradation-robust frequency content. Finally, CSPM performs multi-granularity prototype contrastive learning at the cluster level and view level, then conducts cross-modal matching of multi-granularity spatial-frequency prototypes, thus establishing robust label association. With the above modules, our proposed framework can learn corruption-invariant feature components and generate robust cross-modal correspondence from unlabeled cross-modal images. Extensive experiments demonstrate that our MSPL outperforms other state-of-the-art methods by a large margin on the challenging SYSU-MM01-C and RegDB-C, while remaining competitive on SYSU-MM01 and RegDB.

Abstract:
Weakly supervised text-based person re-identification aims to retrieve specific pedestrians based on textual descriptions without identity labels available during training. This task remains challenging due to the inherent cross-modal heterogeneity and lack of identity annotations. The modality gap is a common issue in vision-language models, and it affects the performance of downstream tasks such as cross-modal retrieval and multimodal clustering. Specifically, in our research and experiments, we found inter-modal misalignment between the image and text modalities. However, existing methods rely on mutual enhancement strategies between image and text clustering, leading to the accumulation of clustering noise and affecting the final retrieval performance. To address this issue, we propose a Consensus Labelling: Prompt-guided Clustering refinement (CLPC) framework for weakly supervised text-based person re-identification. Specifically, we introduce a textual inversion network to learn a pseudo token that captures visual context, which is then integrated into natural language sentences as a personalized textual prompt. To further improve clustering quality, we introduce a Nearest Neighbor-Guided Pseudo Label Mining (NGPM) method, which uses the clusters derived from personalized textual prompts to refine the clustering of image features. Additionally, we design a Dynamic Margin Triplet (DMT) loss, where the margin is adaptively adjusted using a sigmoid-based function to enhance the model’s ability to distinguish hard negative samples. We also introduce a Normalized Distribution Matching (NDM) loss to minimize the KL divergence between the image-text matching scores and the normalized soft matching scores. Extensive experimental results on three public datasets demonstrate the superiority of our method. Our code is available at https://github.com/LeviWeiZhi/CLPC

Abstract:
Blockchain technology improves supply chain management by ensuring the immutability of transaction records and facilitating process tracking. However, the transparency of blockchain raises significant privacy concerns, as sensitive information such as buyer and supplier qualifications, product specifications, and transaction amounts is often exposed. Compliance verification, which needs access to specific sensitive data for compliance checks, becomes challenging in blockchain-based privacy-preserving supply chains. This paper introduces ZKVeil, an innovative scheme utilizing zero-knowledge proof technology to maintain the confidentiality of sensitive information while ensuring compliance verification. Additionally, ZKVeil uses decentralized identifiers and verifiable credentials to ensure the authenticity of transaction data. A theoretical security analysis demonstrates the effectiveness of ZKVeil in safeguarding real sensitive data and ensuring compliance with regulations. To evaluate the performance of our scheme, we implement ZKVeil on a private blockchain of 100 nodes. Taking the shipbuilding supply chain transaction as an example, the experimental results demonstrate that ZKVeil incurs low gas consumption, execution time, and memory overhead.

Abstract:
Event camera-based person re-identification (Re-ID) effectively addresses the challenges faced by traditional Re-ID systems, such as privacy leakage, low-light imaging degradation, and motion blur. However, traditional Convolutional Neural Networks (CNNs) struggle to model long-range spatio-temporal dependencies, while the quadratic computational complexity of the Transformer architecture conflicts with the high temporal resolution of event streams. Additionally, sparse data leads to wasted computational resources and diluted effective data. In contrast, the Mamba architecture, with its long-term modeling capability and linear complexity, is better suited for event stream data. Therefore, we explore the potential of VMamba in event camera-based person Re-ID; however, directly using VMamba does not fully leverage the temporal asynchronicity and spatial sparsity inherent in event data. To address this, we design a novel Sparse VMamba framework to construct a more robust spatio-temporal information extraction mechanism. First, we develop a Spatio-Temporal Information Modeling (STIM) module that simultaneously employs CNNs and Gated Recurrent Units (GRUs) for modeling spatial and temporal information. Then, we enhance the robustness of sparse data feature extraction using two strategies: on the one hand, we use an Anti-Noise Contour Enhancement (ANCE) module to improve motion contour features and mitigate sensor pulse noise; on the other hand, we implement a Direction-Aware Sparse Perception (DASP) module to encourage the model to extract robust person descriptors. Results on the Event-ReID-v1 and Event-ReID-v2 datasets validate the effectiveness of our approach.

Abstract:
Cloth-changing person re-identification (CC-ReID) aims to match individuals wearing varying clothes across camera views. Existing CC-ReID methods typically focus on extracting clothing-invariant features such as body shape, pose, gait, etc. However, these features are diverse and often entangled with clothing-related visual clues, posing significant challenges for comprehensively and effectively separating and leveraging them for re-identification. To address these challenges, we propose a text-guided clothing generalizable (Tex-CG) model, which employs multi-modal large language models (MLLMs) to comprehensively and explicitly decouple clothing-invariant features from pedestrian images in the textual domain. By shifting feature disentanglement to the textual domain, the interference caused by visual entanglement between clothing and clothing-invariant clues can be significantly reduced. Additionally, to ensure compatibility between offline-decoupled features from MLLMs and our online-trained Tex-CG model, we utilize CLIP-based image-text matching to train implicit clothing-invariant prompts that embed discriminative pedestrian information. A dynamic fusion module is subsequently introduced to leverage these implicit prompts for selectively integrating valuable and compatible components from the MLLM’s explicitly decoupled features, constructing robust guidance to direct our model to effectively capture clothing-invariant visual clues for re-identification. Extensive experiments demonstrate the effectiveness of our method, and the Tex-CG model achieves state-of-the-art performance on five mainstream CC-ReID benchmarks. Our code is available at https://github.com/JiaoBL1234/Tex-CG

Abstract:
Concept drift refers to the deviation in data distribution over time, driven by dynamic changes in attackers or environments. This phenomenon poses a significant challenge for deploying machine learning models in cybersecurity. Existing approaches rely heavily on frequent retraining or distribution-level analyses, which are costly, labor-intensive, and often lack interpretability. To address these limitations, we propose DriftTrace, a novel system designed to detect, explain, and adapt to concept drift in security applications. Through comprehensive analysis, we uncover associations, consistencies, and diversities in security application features. Inspired by these findings, we detect drift at the sample level using a contrastive learning-based autoencoder, enabling fine-grained detection without requiring extensive labeling. For explanation, we employ a greedy feature selection strategy that links detection decisions to semantically relevant input features. To address data imbalance during adaptation, DriftTrace leverages sample interpolation techniques. We evaluate DriftTrace on Android malware datasets (Drebin and MalDroid2020) and a network intrusion dataset (IDS2018). Our system achieves an average detection F1 score of more than 0.94, which is superior to the advanced baseline TRANSCENDENT, and improves the explanation fidelity by an average of 76% compared with CADE. These results highlight the practicality of DriftTrace for security scenarios.
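
The sample interpolation step mentioned above can be illustrated with a minimal sketch. The abstract does not say which interpolation technique DriftTrace uses, so this assumes plain linear interpolation between random pairs of minority-class samples (in the spirit of SMOTE); the function name and the toy data are hypothetical.

```python
import random


def interpolate_samples(minority, count, rng):
    """Create `count` synthetic points on segments between random pairs of minority samples."""
    synthetic = []
    for _ in range(count):
        a, b = rng.sample(minority, 2)   # two distinct existing samples
        lam = rng.random()               # interpolation factor in [0, 1)
        synthetic.append([x + lam * (y - x) for x, y in zip(a, b)])
    return synthetic


# Toy minority class: three 2-D feature vectors lying on the line y = 2x.
minority = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
synthetic = interpolate_samples(minority, count=5, rng=random.Random(0))
```

Every synthetic point lies on a segment between two real samples, so the minority class is enlarged without leaving the region it already occupies.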

Affiliations: School of Information Science and Technology and Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao University of Science and Technology, Qingdao, Shandong, China; State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, Shaanxi, China; School of Computer Science and Information Engineering and the Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei University of Technology, Hefei, China; School of Artificial Intelligence and the School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China

Abstract:
Secure Aggregation (SA) is a fundamental privacy-preserving technique in Federated Learning (FL) that ensures the confidentiality of local model updates while enabling global model aggregation. Previous studies have implemented SA within the FL architecture that includes a central server. However, in a Device-to-Device (D2D) based FL, decentralized SA becomes challenging due to the lack of a central server, particularly in a zero-trust network vulnerable to Byzantine attacks. To address this issue, we present a novel Byzantine-robust decentralized SA protocol (DeSA) that guarantees the integrity of model training and aggregation while protecting the privacy of model updates. Specifically, we utilize an enhanced zk-SNARK proof system to verify the local model training process. Additionally, we propose a framework that embeds multiple zero-knowledge proofs to ensure the integrity of model aggregation, while maintaining succinct proofs and fast verification. Moreover, we present a Byzantine-robust D2D aggregation protocol that can withstand malicious nodes trying to disrupt model aggregation. To protect privacy, we develop a one-time masking method that eliminates aggregated masks through a dynamic aggregation strategy. This strategy takes into account the adjacency and trust relationships among nodes in evolving network topologies. Finally, we perform a theoretical analysis and evaluate DeSA on real-world datasets. Experimental results show that the time required to verify an embedded proof is significantly reduced compared to the time of verifying multiple proofs. Additionally, its accuracy remains robust against malicious nodes.
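
The idea that masks vanish once updates are aggregated can be sketched as follows. This is the textbook pairwise-masking construction, in which each pair of nodes shares a random mask that one adds and the other subtracts. It is not DeSA's dynamic aggregation strategy, which the abstract does not detail; the modulus and all function names are hypothetical.

```python
import random

MODULUS = 2**31 - 1  # hypothetical modulus for the masked arithmetic


def pairwise_masks(num_nodes, seed=0):
    """Return masks[i][j] such that masks[i][j] + masks[j][i] == 0 (mod MODULUS)."""
    rng = random.Random(seed)
    masks = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            m = rng.randrange(MODULUS)
            masks[i][j] = m
            masks[j][i] = (-m) % MODULUS
    return masks


def mask_update(update, node, masks):
    """Hide a node's update by adding every mask it shares with its peers."""
    return (update + sum(masks[node])) % MODULUS


def aggregate(masked_updates):
    """Sum the masked updates; the pairwise masks cancel in the total."""
    return sum(masked_updates) % MODULUS


updates = [5, 11, 42, 7]  # toy scalar model updates
masks = pairwise_masks(len(updates))
masked = [mask_update(u, i, masks) for i, u in enumerate(updates)]
total = aggregate(masked)
```

Each masked value on its own reveals nothing useful about the underlying update, yet `total` equals the plain sum of the updates because every mask appears once with each sign.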

Abstract:
This study considers covert and secure communications in which a transmitter, Alice, sends information to a legitimate receiver, Bob, using a differential space-time line code (Diff-STLC) scheme. A warden, Willie, attempts to detect Alice’s transmission based on signal strength. Upon successful detection, Willie becomes an eavesdropper (Eve) and starts to eavesdrop on the information. Willie and Eve are adversaries against covert and secure communications, respectively. To enhance the limited secrecy of a conventional Diff-STLC system, we propose a code-hopping strategy by randomly switching between two Diff-STLC structures. As a result, only Bob, aware of the hopping pattern, can decode the signals. To quantify both covertness and secrecy, we define a covert secrecy rate (CSR) as the difference between the achievable rates at Bob and Eve. Using the detection probability of Willie and the bit error rates of Eve and Bob, we derive an analytical lower bound for CSR. Both analytical and Monte Carlo results confirm that the proposed Diff-STLC hopping method significantly enhances the performance of CSR, thus improving the overall covertness and secrecy in communications.

Abstract:
Pseudonym Self-Generation (PSG) is essential for Conditional Anonymous Authentication (CAA) in Vehicular Ad-hoc Networks (VANETs), enabling vehicles to autonomously create communication identities without centralized control. However, existing self-generation schemes face two problems: efficiently revoking malicious vehicles remains difficult, and hardware-independent solutions additionally struggle to verify the legitimacy of pseudonyms. To overcome these issues, we propose an efficient Revocable CAA scheme with Verifiable Self-generated Pseudonyms for VANETs (RCAA_VSP). First, a Traceable Chameleon Hash (TCH) function is designed to establish membership implicitly through TCH collisions, enforcing verifiable and unlinkable pseudonym generation. Second, a multi-parameter authentication mechanism leveraging dynamic accumulators is introduced, which achieves O(1)-complexity revocation via domain membership polynomial updates without system-wide reconfiguration. Formal security analysis demonstrates the semantic security of our proposed scheme under the random oracle model, with proven resilience against common attacks. Experimental evaluations demonstrate that RCAA_VSP achieves a 76% relative improvement in pseudonym generation efficiency, while keeping the overhead of anonymous authentication and malicious vehicle revocation at the millisecond level, providing a lightweight solution for secure VANET communication.
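
What a chameleon hash collision is can be shown with a minimal sketch. It uses the classic discrete-logarithm chameleon hash CH(m, r) = g^m h^r mod p, not the paper's TCH construction, whose details are not given in the abstract. The parameters are toy values chosen for readability and are far too small for real use; all names are hypothetical.

```python
# Toy group: p = 2q + 1 with q prime, and g of order q modulo p.
P, Q, G = 23, 11, 4


def keygen(trapdoor):
    """Public key h = g^x mod p for a trapdoor x in [1, q - 1]."""
    return pow(G, trapdoor, P)


def chameleon_hash(h, message, randomness):
    """CH(m, r) = g^m * h^r mod p."""
    return (pow(G, message % Q, P) * pow(h, randomness % Q, P)) % P


def find_collision(trapdoor, message, randomness, new_message):
    """Use the trapdoor to find r' with CH(m, r) == CH(m', r').

    Solves m + x*r = m' + x*r' (mod q) for r'. Requires Python 3.8+ for pow(x, -1, q).
    """
    inverse = pow(trapdoor, -1, Q)
    return (randomness + (message - new_message) * inverse) % Q


trapdoor = 7
h = keygen(trapdoor)
original = chameleon_hash(h, message=3, randomness=5)
new_r = find_collision(trapdoor, message=3, randomness=5, new_message=9)
forged = chameleon_hash(h, message=9, randomness=new_r)
```

Only the trapdoor holder can compute `new_r`, so producing a matching digest for a new message demonstrates knowledge of the trapdoor without revealing it.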

Abstract:
Despite their impressive capability in learning over graph-structured data, graph neural networks (GNNs) suffer from adversarial topology perturbation in both training and inference phases. While adversarial training has demonstrated remarkable effectiveness in image classification tasks, its suitability for GNN models was doubted until a recent advance shifted the focus from transductive to inductive learning. Still, GNN robustness in the inductive setting is under-explored and calls for a deeper understanding of GNN adversarial training. To this end, we introduce the concept of graph subspace energy (GSE) of the adjacency matrix, a generalization of graph energy that measures graph stability, as an indicator of GNN robustness against topology perturbations. To further demonstrate the effectiveness of this concept, we propose an adversarial training method, referred to as AT-GSE, in which the perturbed graphs are generated by maximizing the GSE regularization term. To deal with the local and global topology perturbations raised respectively by LRBCD and PRBCD, we employ randomized SVD (RndSVD) and Nyström low-rank approximation to favor the different aspects of the GSE terms. An extensive set of experiments shows that AT-GSE consistently outperforms the state-of-the-art GNN adversarial training methods over different homophily and heterophily datasets in terms of adversarial accuracy, whilst, more surprisingly, achieving a superior clean accuracy on non-perturbed graphs.
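
Graph energy, the quantity that GSE generalizes, is the sum of the absolute eigenvalues of the adjacency matrix. The sketch below computes it with NumPy. The abstract does not give the exact definition of GSE, so `partial_energy` is only a hypothetical stand-in that sums a chosen range of singular values.

```python
import numpy as np


def graph_energy(adj):
    """Graph energy: sum of absolute eigenvalues of a symmetric adjacency matrix."""
    eigenvalues = np.linalg.eigvalsh(adj)
    return float(np.abs(eigenvalues).sum())


def partial_energy(adj, lo, hi):
    """Hypothetical subspace variant: sum of singular values with rank index in [lo, hi)."""
    singular_values = np.linalg.svd(adj, compute_uv=False)  # descending order
    return float(singular_values[lo:hi].sum())


# Path graph on four nodes: 0 - 1 - 2 - 3.
path = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
energy = graph_energy(path)
top_two = partial_energy(path, 0, 2)
```

For a symmetric matrix the singular values are the absolute eigenvalues, so summing all of them recovers the graph energy, while a narrower index range isolates the contribution of one part of the spectrum.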

Abstract:
Confidential Compute Architecture (CCA) is the latest Trusted Execution Environment (TEE) system on Arm. It offers a VM-level execution environment designed to host applications that manage security-sensitive tasks and safeguard them from malicious system software. Although this VM-level design simplifies TEE adoption, it introduces a large attack surface. Attackers can break isolation by exploiting vulnerabilities in any component of the VM. In this paper, we present HiveTEE, a scalable intra-TEE isolation architecture that leverages Realm Management Extension (RME) and Memory Tagging Extension (MTE). HiveTEE allows developers to partition applications into multiple isolated domains (SDoms), preventing a compromise in one part of the application from propagating across the entire TEE. To evaluate the performance overhead introduced by HiveTEE, we apply it to three real-world applications: OpenSSL, SQLite, and Memcached. The evaluation results show that HiveTEE incurs a small performance overhead (<3%).

Abstract:
Advanced Persistent Threats (APTs) have become one of the most prominent cybersecurity risks globally. The external network-facing (ENF) services (e.g., e-commerce platforms) within a system are particularly vulnerable, as they are directly exposed to the Internet and often serve as the primary targets for attackers. By deploying deception resources to protect these ENF services, defenders can detect threats early, block potential attacks, and enhance overall system resilience. However, most existing studies on cyber deception strategies assume simultaneous moves by both attacker and defender. Furthermore, few works have considered the evolution of the system state resulting from APT attacks on the ENF services. To address these limitations, this paper proposes a Cyber Deception Stackelberg Markov Game (CDSMG) for protecting ENF services, which dynamically captures state transitions and accurately characterizes the strategic interactions between defenders and APT attackers. In CDSMG, the defender acts as the leader, who proactively selects a subset of services to deploy the deception resources based on the current system state, while the APT attacker plays as the follower, making a best response which incorporates the defender’s policy into its own strategy. To overcome the challenge of the combinatorial optimization problem of selecting a subset of services, we propose a revised version of the PPO algorithm by using no-replacement sampling to select multiple services at once, thereby significantly reducing the action space size. Finally, experimental results demonstrate that our approach effectively defends against APT attacks. It not only outperforms several baseline methods but also exhibits better scalability and robustness under varied model parameter settings.
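
Selecting several services at once by sampling without replacement can be sketched as below. This is a generic sequential draw from a softmax over per-service scores, not the authors' revised PPO; the scores and function name are hypothetical. The policy only needs one score per service rather than one output per possible subset.

```python
import math
import random


def sample_subset(logits, k, rng):
    """Draw k distinct indices one at a time from a softmax over the remaining logits.

    Returns the chosen indices in draw order and the log-probability of that ordered draw.
    """
    remaining = list(range(len(logits)))
    chosen, log_prob = [], 0.0
    for _ in range(k):
        peak = max(logits[i] for i in remaining)  # subtract the max for numerical stability
        weights = [math.exp(logits[i] - peak) for i in remaining]
        total = sum(weights)
        pick = rng.choices(range(len(remaining)), weights=weights)[0]
        log_prob += math.log(weights[pick] / total)
        chosen.append(remaining.pop(pick))  # remove it so it cannot be drawn again
    return chosen, log_prob


# Hypothetical per-service scores produced by a policy network for five services.
service_logits = [0.5, 1.0, -0.3, 2.0, 0.0]
chosen, log_prob = sample_subset(service_logits, k=3, rng=random.Random(0))
```

The returned log-probability is the quantity a policy-gradient method would use in its probability ratio for the sampled subset.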

Abstract:
Efficient and maliciously secure multiparty private set intersection (mPSI), especially for variants enabling cardinality (mPSI-CA) or secret-shared outputs with low communication overhead, faces ongoing challenges regarding performance and scalability. Existing approaches either rely on the strong security assumption of non-colluding centers or are relatively expensive in terms of computational and communication overheads. Addressing these limitations, this paper introduces a new suite of protocols and formalizes them in the real/ideal model to prove their security. First, our core mPSI protocol achieves malicious security under the standard honest majority model, relying solely on symmetric-key primitives. The approach employs a lightweight, oblivious key-value store (OKVS)-based architecture where each non-pivot party sends only a single message to a designated pivot. This approach, inspired by Nevo et al. (CCS 2021), minimizes client overhead by carefully delegating core computations. We extend this framework to support cardinality (mPSI-CA) and secret sharing (mPSI-SS) functionalities, which require an additional non-collusion assumption among specific parties. We also introduce a Chinese Remainder Theorem (CRT)-based batching technique for parallel mPSI, achieving near-linear communication savings by compressing multiple OKVS structures. This method generally trades higher computational costs (from polynomial operations and CRT) for communication efficiency, but it is highly effective when communication is paramount or when batching numerous instances of small item sets, where encoding computations can be competitive. Finally, our implementation and evaluation of the proposed mPSI and mPSI-CA protocols in both LAN and WAN settings demonstrate their practical advantages. For instance, in the LAN setting with 15 parties, t = 7, and m = 2^20, our mPSI protocol is 3.0× faster and uses 2.5× less communication than Nevo et al. (CCS 2021). Against the approach of Gao et al. (CCS 2024), our maliciously secure protocol is 1.4× faster with comparable communication overhead under weaker assumptions.
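
The number-theoretic core of CRT batching, packing several residues into a single value and recovering each one later, can be sketched as follows. The sketch works over integers and does not reproduce the paper's encoding of OKVS structures or its communication savings; the moduli and function names are hypothetical.

```python
from math import prod


def crt_pack(residues, moduli):
    """Combine residues r_i mod m_i (pairwise coprime m_i) into one integer mod prod(m_i)."""
    total = prod(moduli)
    packed = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        packed += r * partial * pow(partial, -1, m)
    return packed % total


def crt_unpack(packed, moduli):
    """Recover each residue by reducing the packed value modulo its own modulus."""
    return [packed % m for m in moduli]


moduli = [101, 103, 107]   # pairwise coprime, one modulus per batched instance
residues = [17, 42, 99]    # one value from each instance
packed = crt_pack(residues, moduli)
recovered = crt_unpack(packed, moduli)
```

Each instance reads back only its own value from the shared packed integer, which is the property a batching scheme builds on.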

Abstract:
Gradient leakage attacks pose significant privacy risks in federated learning by exploiting transmitted gradients to reconstruct sensitive data. While existing defense mechanisms typically apply uniform perturbation across all gradients, we identify a critical oversight: privacy information in gradients exhibits inherent layer-wise heterogeneity. Through systematic analysis, we establish that different neural network layers contain varying amounts of reconstructable private information due to differential accumulation of nonlinear effects during gradient formation. This fundamental discovery enables our key innovation—Layer-Specific Gradient Protection (LSGP)—which pioneers surgical defense mechanisms that adapt protection intensity to each layer’s inherent privacy exposure level. Experimental results validate that LSGP achieves superior defense efficacy with comparable model utility compared to uniform protection baselines, establishing a new paradigm for efficient privacy-preserving machine learning through principled vulnerability analysis of gradient formation mechanics.
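
Adapting protection intensity per layer can be illustrated with a minimal sketch. The abstract does not describe how LSGP scores a layer's privacy exposure or what protection it applies, so this assumes Gaussian noise whose standard deviation scales with a given per-layer exposure score; the scores, layer names, and function name are hypothetical.

```python
import random


def protect_gradients(layer_grads, exposure, base_sigma, rng):
    """Add Gaussian noise to each layer, with standard deviation proportional to its exposure score."""
    protected = {}
    for name, grad in layer_grads.items():
        sigma = base_sigma * exposure[name]
        protected[name] = [g + rng.gauss(0.0, sigma) for g in grad]
    return protected


# Toy flattened gradients for two layers and hypothetical exposure scores.
grads = {"conv1": [0.2, -0.1, 0.4], "fc": [0.05, 0.3]}
exposure = {"conv1": 2.0, "fc": 0.0}  # a score of 0.0 leaves the layer untouched
protected = protect_gradients(grads, exposure, base_sigma=0.1, rng=random.Random(0))
```

A uniform defense corresponds to giving every layer the same score; the layer-wise variant spends the perturbation budget only where exposure is high.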

Abstract:
Integrating massive multiple-input multiple-output (mMIMO) systems with intelligent reflecting surfaces (IRS) presents a promising paradigm for enhancing physical-layer security (PLS) in wireless communications. However, deploying high-resolution quantizers in large-scale mMIMO arrays, along with numerous IRS elements, leads to substantial hardware complexity. To address these challenges, this paper proposes a cost-effective PLS design for IRS-assisted mMIMO systems by employing one-bit digital-to-analog converters (DACs). The focus is on jointly optimizing one-bit quantized precoding at the transmitter and constant-modulus phase shifts at the IRS to maximize the secrecy rate. This leads to a highly non-convex fractional secrecy rate maximization (SRM) problem. To efficiently solve this problem, two algorithms are proposed: 1) the WMMSE-PDD algorithm, which reformulates the SRM problem into a sequence of non-fractional programs with auxiliary variables using the weighted minimum mean-square error (WMMSE) method and solves them via the penalty dual decomposition (PDD) approach, achieving superior secrecy performance; and 2) the exact penalty product Riemannian gradient descent (EPPRGD) algorithm, which transforms the SRM problem into an unconstrained optimization over a product Riemannian manifold, eliminating auxiliary variables and enabling faster convergence with a slight trade-off in secrecy performance. Both algorithms provide analytical solutions at each iteration and are proven to converge to Karush–Kuhn–Tucker (KKT) points. Simulation results confirm the effectiveness of the proposed methods and highlight their respective advantages.

Abstract:
Malicious jamming poses a pervasive threat to secure communications, and the challenge grows more severe as jammers become increasingly capable of adapting to legitimate transmissions. This paper investigates jamming mitigation using an active reconfigurable intelligent surface (ARIS), explicitly accounting for channel uncertainties to obtain a robust anti-jamming design. To this end, we adopt a Stackelberg game formulation to model the strategic interaction between the legitimate side and the adversary, acting as the leader and follower, respectively. We prove the existence of the game equilibrium and adopt the backward induction method for equilibrium analysis. We first derive the optimal jamming policy as the follower’s best response, which is then incorporated into the legitimate-side optimization. To handle the uncertainty, we reformulate the legitimate-side problem by exploiting the error bounds to combat worst-case jamming attacks. The problem is decomposed within a block successive upper bound minimization (BSUM) framework into power allocation, transceiving beamforming, and active reflection subproblems, which are solved iteratively to obtain the robust jamming mitigation scheme. Simulation results demonstrate the effectiveness of the proposed scheme in protecting legitimate transmissions under uncertainties, as well as its superior jamming mitigation performance compared with the baselines.
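
Backward induction in a leader-follower game can be sketched on a small finite example. The action sets and utility functions below are invented for illustration and have no connection to the paper's ARIS channel model; only the solution order (follower's best response first, then the leader's choice) reflects the abstract.

```python
def stackelberg_equilibrium(leader_actions, follower_actions, leader_utility, follower_utility):
    """Backward induction on a finite two-stage game.

    For each leader action, compute the follower's best response, then let the
    leader pick the action that maximizes its own utility given that response.
    """
    best = None
    for a in leader_actions:
        response = max(follower_actions, key=lambda b: follower_utility(a, b))
        value = leader_utility(a, response)
        if best is None or value > best[2]:
            best = (a, response, value)
    return best


# Hypothetical toy game: the leader chooses transmit power p, the jammer chooses jamming power j.
def legit_utility(p, j):
    return p / (1 + j) - 0.2 * p      # received quality minus power cost


def jammer_utility(p, j):
    return -p / (1 + j) - 0.3 * j     # harm to the link minus jamming cost


leader_action, follower_action, leader_value = stackelberg_equilibrium(
    [1, 2, 3], [0, 1, 2], legit_utility, jammer_utility
)
```

In this toy game the leader transmits at the highest power even though doing so provokes the strongest jamming response, because it anticipates that response when making its choice.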

Abstract:
Voice authentication has been widely used on smartphones. However, it remains vulnerable to spoofing attacks, where the attacker replays recorded voice samples from authentic humans using loudspeakers to bypass the voice authentication system. In this paper, we present MagLive, a robust voice liveness detection scheme designed for smartphones to mitigate such spoofing attacks. MagLive leverages the differences in magnetic pattern changes generated by different speakers (i.e., humans or loudspeakers) when speaking for liveness detection, which are captured by the built-in magnetometer on smartphones. To extract effective and robust magnetic features, MagLive utilizes a TF-CNN-SAF model as the feature extractor, which includes a time-frequency convolutional neural network (TF-CNN) combined with a self-attention-based fusion (SAF) model. Supervised contrastive learning is then employed to achieve user-irrelevance, device-irrelevance, and content-irrelevance. MagLive imposes no additional burden on users and does not rely on active sensing or specialized hardware. We conducted comprehensive experiments with various settings to evaluate the security and robustness of MagLive. Our results demonstrate that MagLive effectively distinguishes between humans and attackers (i.e., loudspeakers), achieving an average balanced accuracy (BAC) of 99.01% and an equal error rate (EER) of 0.77%.
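
The two reported metrics can be computed as in the following sketch, which assumes label 1 for a live human and 0 for a loudspeaker, with higher scores meaning "more likely live". The scores are made-up toy values, and the EER routine is a simple threshold scan rather than an interpolated estimate.

```python
def balanced_accuracy(labels, predictions):
    """BAC = (true positive rate + true negative rate) / 2, with labels in {0, 1}."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return 0.5 * (tp / positives + tn / negatives)


def equal_error_rate(labels, scores):
    """Approximate EER: scan thresholds and average FAR and FRR where they are closest."""
    live = [s for y, s in zip(labels, scores) if y == 1]
    spoof = [s for y, s in zip(labels, scores) if y == 0]
    best_gap, best_eer = None, None
    for threshold in sorted(set(scores)):
        frr = sum(s < threshold for s in live) / len(live)     # live speakers rejected
        far = sum(s >= threshold for s in spoof) / len(spoof)  # loudspeakers accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
bac = balanced_accuracy(labels, predictions)
eer = equal_error_rate(labels, scores)
```

BAC weights the two classes equally regardless of how many samples each has, while EER summarizes the operating point at which false acceptances and false rejections are equally frequent.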

Abstract:
The CROSS Digital Signature Algorithm (DSA), currently a second-round candidate in the NIST standardization process for additional post-quantum digital signatures, offers compact public keys and strong security guarantees rooted in the code-based Restricted Syndrome Decoding Problem (R-SDP) and its variant R-SDP(G). Despite its strong theoretical foundation and practical significance, existing CPU-based implementations of CROSS exhibit evident performance limitations, while its potential for high-throughput acceleration on GPU architectures remains insufficiently investigated. In this work, we present X2O, the first systematically optimized GPU implementation framework for CROSS on NVIDIA GPUs. X2O introduces a novel cross-parallel architecture that integrates both horizontal and vertical parallelism to fully exploit the massive concurrency of modern GPU platforms. The framework incorporates a series of targeted optimizations, including fine-grained thread scheduling, optimized memory access patterns, hash function tuning, and GPU-efficient tree construction. Experimental results on a NVIDIA RTX 4090 demonstrate the efficiency of our design, achieving up to 1,082,904 signature generations and 1,589,595 verifications per second at NIST security level 1. Compared to the official AVX-optimized CPU implementation, our GPU-based approach achieves up to 120× speedup, establishing a new performance benchmark for CROSS and demonstrating the viability of high-throughput, post-quantum digital signatures on parallel computing platforms.

Abstract:
Generative language models (GLMs) are increasingly integrated into modern intelligent applications to power intelligent functionalities. Developers often fine-tune open-source GLMs on proprietary data and deploy them in real-world applications. In this paper, we reveal a novel model supply chain attack that exploits this workflow: by injecting backdoors into the source code of an open-source GLM, an adversary can induce the model to memorize fine-tuning data and later regenerate it via crafted prompts. We propose Lure, a new backdoor-based data recovery attack that exploits memorization capabilities of fine-tuned models. During fine-tuning, Lure stealthily injects unique and attacker-enumerable hash prompts, and incorporates a Position-Decay Weighted Aligned Cross-Entropy Loss into the original fine-tuning loss, strengthening the association between injected prompts and corresponding data samples for effective data recovery. To achieve stealthy and transparent attack injection, Lure employs a stealthy backdoor within the model’s source code, enabling automatic injection of hash prompts during fine-tuning and thus maintaining the user’s original fine-tuning workflow. Lure also proposes several optimizations to maintain minimal impact on the performance of the original task and external training state. Extensive evaluations demonstrate the remarkable efficacy of Lure, achieving a 45%-68% data recovery rate while maintaining the attack’s transparency, stealthiness, and showcasing its ability to evade existing defenses.

Affiliations: College of Computing and Data Science, Nanyang Technological University, Jurong West, Singapore; College of Computer Science, Chongqing University, Chongqing, China; College of Computer Science and Technology, Jilin University, Changchun, China; School of Electronics, Electrical Engineering and Computer Science (EEECS), Queen’s University Belfast, Belfast, U.K.; Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA; Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea

Abstract:
The increasing saturation of terrestrial resources has driven the exploration of low-altitude applications such as air taxis. Low altitude wireless networks (LAWNs) serve as the foundation for these applications, and integrated sensing and communication (ISAC) constitutes one of the core technologies within LAWNs. However, the open nature of low-altitude airspace makes LAWNs vulnerable to malicious channel access attacks, which degrade the ISAC performance. Therefore, this paper develops a game-based framework to mitigate the influence of the attacks on LAWNs. Concretely, we first derive expressions of communication data’s signal-to-interference-plus-noise ratio and the age of information of sensing data under attack conditions, which serve as quality of service metrics. Then, we formulate the ISAC performance optimization problem as a Stackelberg game, where the attacker acts as the leader, and the legitimate drone and the ground ISAC base station act as second and first followers, respectively. On this basis, we design a backward induction algorithm that achieves the Stackelberg equilibrium while maximizing the utilities of all participants, thereby mitigating the attack-induced degradation of ISAC performance in LAWNs. We further prove the existence of the equilibrium. Simulation results show that the proposed algorithm outperforms existing baselines and a static Nash equilibrium benchmark, ensuring that LAWNs can provide reliable service for low-altitude applications.

Abstract:
With the open-source proliferation of cyber-attack technologies, attackers’ capacity to adapt existing strategies and exploit system vulnerabilities has markedly increased, resulting in a growing number of unknown cyber-attacks. Traditional detection approaches encounter two principal challenges: 1) the need for a large volume of labeled attack samples, which are difficult to acquire in practice for deep learning-based detection methods; and 2) limited effectiveness in identifying novel, previously unseen attacks. In this paper, we propose Exploratory Detection of Unknown Cyber-Attacks via Evolutionary Strategy and Machine Learning. Specifically, we first train kernel-based Ramp-OCSVM models on full features of known attacks to derive class-specific thresholds, while inferring unknown attack thresholds via Gaussian distribution. Subsequently, features of known samples are defined as “genes”, and evolutionary feature representations are produced through multi-strategy evolution mechanisms. These evolved features are then processed by the trained Ramp-OCSVM models in conjunction with the corresponding thresholds to distinguish known-attack variants from unknown samples. Finally, a Random Forest classifier is iteratively trained using the evolved features, and the optimal model from the iterative process is selected according to detection performance. Comprehensive experiments were conducted on authoritative benchmark datasets. The experimental results yield F1 scores of 82.70% and 87.64% for unknown attack detection under different experimental settings. In the few-shot learning scenario, the mean F1 scores increase to 99.84% and 95.80% for detecting known and unknown attacks, respectively. Notably, the proposed method achieved an F1 score of 88.06% in the few-shot multi-class unknown attack experiment. Compared with SOTA approaches, the proposed method improves the F1 score by 2.19%, and demonstrates a 53.99% higher F1 score than detection approaches based on GAN and VAE.

Abstract:
Domain gap often degrades the performance of speaker verification (SV) systems when the statistical distributions of training data and real-world test speech are mismatched. Channel variation, including bandwidth changes, background noise, and encoding, is a primary cause of this gap. Although various domain adaptation algorithms could be applied to handle this problem, most cannot account for the complex distribution structure when combining domain alignment with discriminative learning. In this paper, we propose a novel unsupervised domain adaptation method for speaker verification, i.e., Joint Partial Optimal Transport with Pseudo Label (JPOT-PL), to alleviate the domain mismatch problem. Leveraging the geometry-aware distance metric of optimal transport in distribution alignment and speaker consistency in speech distribution, we further design a pseudo label-based discriminative learning scheme in which the pseudo label can be regarded as a new type of speaker label derived from the optimal coupling. With JPOT-PL, we carry out experiments on SV channel and language domain adaptation with VoxCeleb, LibriSpeech, CNCeleb, and AISHELL-2. Experiments show that our method reduces EER by up to 30% compared with several state-of-the-art domain adaptation algorithms.

Abstract:
The Domain Name System (DNS) is a critical internet infrastructure that translates human-readable domain names into machine-routable IP addresses. However, DNS is inherently vulnerable to manipulation, with hijacking attacks growing in both frequency and sophistication. Existing detection methods primarily rely on traffic analysis at specific network points. However, they suffer from limited coverage and low accuracy in complex environments, such as when CDN is employed. While recent approaches employ graph-based techniques, they still suffer from detection inaccuracy issues due to their failure to account for the complex interdependencies among multiple types of nodes. To address these limitations, we propose a novel heterogeneous graph-based detection framework. Based on the collected DNS records from distributed scanners, our method extracts activity and security features and constructs a heterogeneous graph to capture resolution patterns and cross-entity relationships. We further design a time-decay graph neural network TNHAN that enhances traditional Heterogeneous Graph Attention Networks (HAN) by dynamically weighting recent records. This network improves adaptability to legitimate DNS changes. For evaluation, we conduct experiments on real-world resolvers and domain datasets. Experiment results demonstrate the effectiveness of our method. Our method can achieve an F1-score of 0.96, outperforming the best baseline by 0.057 on average, and up to 0.113 under low label proportion. Moreover, we conduct several case studies on detected incidents, including cases related to geopolitical conflicts, censorship-related hijacking, and manipulation by malicious resolvers. These cases demonstrate the method’s effectiveness in identifying diverse hijacking behaviors in practice.
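
Weighting recent records more heavily can be illustrated with a minimal sketch. The abstract does not give TNHAN's decay function, so this assumes exponential decay with a configurable half-life; the function name, the half-life, and the timestamps are hypothetical.

```python
import math


def time_decay_weights(timestamps, now, half_life):
    """Normalized weights that halve for every `half_life` units of record age."""
    rate = math.log(2) / half_life
    raw = [math.exp(-rate * (now - t)) for t in timestamps]
    total = sum(raw)
    return [w / total for w in raw]


# Three DNS records observed 20, 10, and 0 days ago, with a 10-day half-life.
weights = time_decay_weights([0, 10, 20], now=20, half_life=10)
```

With these values the newest record receives four times the weight of the oldest, so a legitimate change of resolution target quickly dominates the aggregated evidence.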

Abstract:
The increasing sophistication of image manipulation techniques demands robust forensic solutions that can both reliably detect alterations and precisely localize tampered regions. Recent Multimodal Large Language Models (MLLMs) show promise by leveraging world knowledge and semantic understanding for context-aware detection, yet they struggle with perceiving subtle, low-level forensic artifacts crucial for accurate manipulation localization. This paper presents a novel Propose-Rectify framework that effectively bridges semantic reasoning with forensic-specific analysis. In the proposal stage, our approach utilizes a forensic-adapted LLaVA model to generate initial manipulation analysis and preliminary localization of suspicious regions based on semantic understanding and contextual reasoning. In the rectification stage, we introduce a Forensics Rectification Module that systematically validates and refines these initial proposals through multi-scale forensic feature analysis, integrating technical evidence from several specialized filters. Additionally, we present an Enhanced Segmentation Module that incorporates critical forensic cues into SAM’s encoded image embeddings, thereby overcoming inherent semantic biases to achieve precise delineation of manipulated regions. By synergistically combining advanced multimodal reasoning with established forensic methodologies, our framework ensures that initial semantic proposals are systematically validated and enhanced through concrete technical evidence, resulting in comprehensive detection accuracy and localization precision. Extensive experimental validation demonstrates state-of-the-art performance across diverse datasets with exceptional robustness and generalization capabilities.

Abstract:
Recent studies have demonstrated that semi-supervised learning (SSL) is highly vulnerable to backdoor attacks, where adversaries can manipulate up to 90% of model predictions through just a tiny fraction of poisoned training data. Despite the widespread adoption of SSL in safety-critical applications, effective defenses against such attacks remain limited. In this paper, we present a comprehensive defense framework designed to protect SSL against sophisticated backdoor attacks. Our work begins with a systematic analysis of backdoor mechanisms in SSL from two critical perspectives: 1) how attackers establish persistent correlations between triggers and target classes; 2) how triggers are introduced and resist removal at the data level. Our investigation reveals that, unlike supervised learning, SSL backdoor attacks 1) uniquely exploit pseudo-labeling mechanisms to establish stronger trigger-target correlations, and 2) demonstrate remarkable resilience at the data level, with triggers potentially appearing in any frequency band (low, medium, or high). Based on these insights, we introduce Backdoor Invalidator (BI), a defense framework that integrates three novel techniques: complementary learning, trigger mix-up, and dual domain filtering, which collectively obstruct, dilute, and filter the influence of backdoor attacks in both feature learning and data processing. Through extensive evaluation against state-of-the-art attacks, BI significantly reduces the average attack success rate while maintaining comparable accuracy on clean data. We also provide theoretical guarantees for BI’s generalization capability and demonstrate its practical deployability as a plug-in component. The code of this work is available at https://github.com/wxr99/Backdoor_Invalidator4SSL

Abstract:
Reversible adversarial examples offer adequate protection against malicious deep model identification and analysis. However, current methods still face challenges in terms of transferability and robustness, limiting their practical applicability. To address this, we introduce a novel technique for generating reversible adversarial examples using a Self-Reversible Adversarial Patch (SRAP). This approach significantly enhances the transferability and robustness of reversible adversarial examples against standard image processing techniques and adversarial defense methods. Specifically, we present a method for crafting adversarial patches that are small, non-overlapping, and adaptively integrated into specific regions. These adversarial patches are seamlessly combined with a reversible data-hiding technique that relies on prediction error expansion, resulting in adversarial examples with superior robustness and transferability. Experimental results indicate that our method achieves a remarkable transferability rate of up to 90% or higher between different models. Additionally, it exhibits strong robustness against image processing methods and adversarial defense strategies. Furthermore, our adversarial examples demonstrate an impressive attack success rate of 88% on commercial APIs, highlighting the effectiveness and practicality of our approach.

Abstract:
The rapid advancement of automatic speech recognition (ASR) models has significantly bolstered their multilingual proficiency and robustness, amplifying concerns over user speech privacy. Attackers may use hidden microphones or network attacks to capture and transcribe sensitive user interactions. Whisper, a state-of-the-art (SOTA) multilingual speech recognition model, delivers exceptional transcription accuracy across diverse languages. However, its superior performance also extends privacy leakage risks to multilingual contexts. Previous privacy-preserving methods based on adversarial examples were primarily optimized for monolingual models, limiting their effectiveness in multilingual settings. Moreover, as these perturbation mechanisms were predominantly tailored for English, their transferability to other languages remains constrained. To address this vulnerability, we propose the Disentanglement-based Universal Adversarial Perturbation (DUAP), a privacy-preserving method designed to counteract the Whisper model. Unlike optimization-based approaches, DUAP embeds language-specific features in the latent space to generate robust adversarial perturbations, providing consistent protection across multiple languages and effectively mitigating privacy risks in multilingual contexts. The method employs a two-stage language attack: first, a Language Feature Disentanglement model disentangles and reconstructs language-specific features to produce adversarial examples (AEs); second, gradient-based optimization refines AEs to disrupt Whisper’s language identification module. DUAP’s perturbations, effective in physical and digital settings, achieve SNRs from 40 dB (lightest) to above 17 dB (strongest). Across three Whisper model sizes, DUAP yields WERs over 95% (English), 85% (other languages), and 87% (physical settings), maintaining above 96% under AAC (64, 72 kbps) and MP3 (32, 96 kbps) compressions.

Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; School of Information Engineering, Yangzhou University, Yangzhou, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR) and the Software School, Shandong University, Jinan, China; School of Computing and Communications, Lancaster University, Lancaster, U.K.; School of Computing and Information Technology, Institute of Cybersecurity and Cryptology, University of Wollongong, Wollongong, NSW, Australia

Abstract:
Graph Neural Networks (GNNs) have powerful representation capabilities for graph data, achieving excellent performance across various fields. Considering the scarcity of labels in real-world scenarios, graph self-supervised learning (GSSL) has gained increasing attention due to its ability to train without relying on labels. However, recent studies have revealed that GNNs are vulnerable to stealthy backdoor attacks in GSSL scenarios, enabling the encoder to learn backdoor features simply by injecting triggers. Existing graph backdoor defense methods mainly focus on supervised settings and cannot be directly transferred to self-supervised scenarios due to the lack of label guidance. To bridge this gap, we propose GDetox, the first backdoor defense approach against backdoored encoders in GSSL. GDetox aims to eliminate backdoor logic in encoders while maintaining the encoder’s original performance. Specifically, GDetox can purify the graph backdoor encoder based on the self-supervised distillation approach without relying on label information. Further, we introduce an adversarial contrastive learning that augments node representations without relying on labels to enhance teacher model performance, thereby improving distilled encoder performance. We evaluate the defense performance of GDetox on four node classifications and four graph classification datasets by comparing with four state-of-the-art (SOTA) defense methods against seven latest backdoor attack methods on GSSL. Extensive experiments demonstrate that GDetox far outperforms the SOTA defense methods, reducing the attack success rate to 4% with negligible degradation in encoder performance (within 2%) in both node-level and graph-level tasks.

Abstract:
While recent efforts in countering spoofing attacks on voice biometric systems have primarily focused on detecting synthetic speech, Physical Access (PA) attacks, such as audio replay, still pose a serious and unresolved challenge. This research gap has been mainly due to the lack of new, realistic speech corpora for training and testing effective and generalizable countermeasure systems. Given the difficulty in collecting actual audio samples from this kind of attack, simulation has been proposed as an alternative to provide audio replay training data. The objective of this work is the generation of a novel simulated database, called RIRplay, that is both realistic, in the sense of reproducing the actual spoofing process, and representative of a wide variety of possible acoustic contexts. Our results show that training with the RIRplay corpus reduces the Equal Error Rate (EER) by roughly 9 percentage points on the challenging ASVspoof 2021 evaluation set, from 36.89% to 28.04%, compared to models trained on the ASVspoof 2019 corpus, demonstrating significant improvements in out-of-domain generalization.

Abstract:
Electric vehicles (EVs) have become one of the promising solutions to the ever-evolving environmental and energy crisis. The key to the wide adoption of EVs is a pervasive charging infrastructure, composed of both private/home chargers and public/commercial charging stations. The security of EV charging, however, has not been thoroughly investigated. This paper investigates the communication mechanisms between chargers and EVs, and exposes the lack of authenticity protection in the SAE J1772 charging control protocol. To showcase our discoveries, we propose a new class of attacks, ChargeX, which aims to manipulate the charging states or charging rates of EV chargers with the goal of disrupting charging schedules, causing denial of service (DoS), or degrading battery performance. ChargeX inserts a hardware attack circuit to strategically modify the charging control signals. We design and implement multiple attack systems, and evaluate the attacks on a public charging station and two home chargers using a simulated vehicle load in the lab environment. Extensive experiments on different types of chargers demonstrate the effectiveness and generality of ChargeX. Specifically, we demonstrate that ChargeX can force a Tesla’s charging state to switch from “standby” to “charging”, potentially leading to overcharging. Additionally, ChargeX can transition any charging state to an error state, effectively launching a DoS attack on the Tesla. If deployed, ChargeX may significantly undermine people’s trust in the EV charging infrastructure.

Abstract:
Cyber Threat Attribution (CTA) involves identifying the perpetrator of a cyberattack by analyzing evidence from attack incidents. As the final step in attributing cybercrime, it plays a critical role in accountability and incident characterization. To improve identification accuracy, most existing CTA methods rely on deep learning and machine learning techniques, integrating heterogeneous knowledge graphs from diverse information sources to support criminal reasoning. However, for ongoing attacks, these post-hoc learned models may not provide intuitive or actionable investigative leads during the early stages of an incident, limiting their applicability in real-time CTA scenarios. To address this challenge, we propose ThreatMAMBA, a temporally robust reasoning and explanation framework for CTA. The method first constructs a heterogeneous information graph by extracting Indicators of Compromise (IOCs), Tactics, Techniques, and Procedures (TTPs), and temporal relationships from Cyber Threat Intelligence (CTI). It then encodes the knowledge graph by preserving semantically significant states using a state space selection mechanism. Finally, contrastive learning is introduced to partition and construct temporal knowledge graphs according to the chronological development of the incident, enabling accurate attacker identification. Experiments show that at event progression stages of 20%, 40%, 60%, 80%, and 100%, ThreatMAMBA improves the Macro F1 score by 6.94%, 5.09%, 9.33%, 10.70%, and 12.05%, respectively. These gains indicate that ThreatMAMBA provides robust and reliable attribution even during the early stages of an attack. Moreover, ThreatMAMBA can identify TTPs that positively correlate with specific attacker groups. The attacker behavior profiles generated by our system achieve a Jaccard similarity of 0.2991 with MITRE ATT&CK, outperforming Qi’anxin by 0.1836.
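
For readers unfamiliar with the Jaccard similarity used to compare behavior profiles, the following is a minimal sketch of the metric on two sets of technique identifiers. The function name `jaccard` and the two example sets are illustrative assumptions; they are not the profiles or the evaluation code from the paper.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical profiles built from ATT&CK-style technique IDs (illustration only).
predicted = {"T1059", "T1566", "T1027", "T1105"}
reference = {"T1059", "T1566", "T1003"}

score = jaccard(predicted, reference)
print(score)  # 2 shared techniques out of 5 distinct ones -> 0.4
```

A score of 1.0 would mean the generated profile and the reference list exactly the same techniques; 0.0 would mean they share none.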

Abstract:
Malware widely adopts network traffic encryption techniques to conceal malicious activities. Recent research has demonstrated the effectiveness of machine learning (ML)-, deep learning (DL)-, and pre-training-based malware traffic detection methods. However, the vast majority of these methods rely on the complete traffic collected during the malware attack. While certain methods can operate on partial traffic, their detection accuracy often decreases significantly when the available data is restricted to the extreme early stage, where information is most sparse. In this paper, we propose DawnGuard, an effective early-stage encrypted malware traffic detection framework based on multi-flow temporal graph learning. Specifically, based on an analysis of the temporal packet density distribution, DawnGuard introduces a self-adjusting data augmentation strategy for early-stage malware traffic, which forces the model to focus on the early-stage interaction phase with more distinguishable properties. Meanwhile, considering that temporal-topological correlations among multiple flows can provide more distinguishable properties in a malware attack, we further develop a temporal graph learning framework to extract Multi-Flow Graph Features (MGF). Using MGF, DawnGuard implements a Vision Transformer-based detection mechanism that captures both local and global contextual relationships, enabling accurate encrypted malware traffic detection from early-stage traffic. Extensive experiments with two real-world datasets demonstrate that DawnGuard outperforms the state-of-the-art (SOTA) methods in three typical scenarios: varying early-stage time windows, imbalanced data, and unseen malware detection. In particular, DawnGuard achieves an average F1 of 95.11%, 8.7% higher than the SOTA method, using only the first 20% of the complete traffic.

Abstract:
Passive eavesdropping compromises confidentiality in wireless networks, especially in resource-constrained environments where heavyweight cryptography is impractical. Physical layer security (PLS) exploits channel randomness and spatial selectivity to confine information to an intended receiver with modest overhead. However, typical PLS techniques, such as beamforming, artificial noise, and reconfigurable intelligent surfaces, often require additional active power or specialized deployment and rely on precise time synchronization and perfect CSI estimation, which limits their practicality. Meanwhile, the role of ambient backscatter devices (AmBDs) in potentially strengthening the legitimate channel while limiting eavesdroppers in generalized wireless network settings has not been fully investigated. To this end, we propose AmbShield, an AmBD–assisted PLS scheme that leverages naturally distributed AmBDs to simultaneously strengthen the legitimate channel and degrade eavesdroppers’ reception without requiring extra transmit power and with minimal deployment overhead. In AmbShield, AmBDs are exploited as friendly jammers that randomly backscatter to create interference at eavesdroppers, and as passive relays that backscatter the desired signal to enhance the capacity of legitimate devices. We further develop a unified analytical framework that analyzes the exact probability density function (PDF) and cumulative distribution function (CDF) of legitimate and eavesdropper signal-to-interference-noise ratio (SINR), a closed-form secrecy outage probability (SOP), its high-SNR asymptote, and a secrecy diversity order (SDO). The analysis provides clear design guidelines on various practical system parameters to minimize SOP. Extensive experiments that include Monte Carlo simulations, theoretical derivations, and high-SNR asymptotic analysis demonstrate the security gains of AmbShield across diverse system parameters under imperfect synchronization and CSI estimation.

Abstract:
Vertical federated learning (VFL) enables multiple parties with non-overlapping features to collaboratively train a model without sharing raw data. However, its split-learning architecture makes VFL particularly vulnerable to targeted backdoor attacks, while also limiting access to raw features and complete model parameters. These constraints render defenses developed for horizontal federated learning inapplicable. Moreover, existing VFL defenses neglect the defender’s privileged role as the label holder, leaving them vulnerable to adaptive backdoor strategies. In this paper, we propose ConFirm, a post-training defense framework that leverages this privileged position to detect poisoned samples. Our approach is driven by a key insight: poisoned samples exhibit abnormally stable prediction confidence under model perturbations, an intrinsic property of backdoor attacks. By actively perturbing the top model and measuring the consistency of prediction confidence, ConFirm detects poisoned samples based on their perturbation robustness. Extensive experiments across multiple benchmarks show that ConFirm consistently outperforms state-of-the-art methods, achieving AUROC scores exceeding 99.9% and up to 21.2% higher F1-scores. Furthermore, ConFirm remains effective against adaptive attacks. We believe that our findings highlight a promising direction for backdoor defense in VFL, and pave the way for more secure VFL systems.

Abstract:
Cloud computing enables efficient resource sharing, but the shared environment exposes virtual machines (VMs) to the risk of side-channel attacks (SCAs). Adaptive VM migration mitigates such threats by dynamically reshuffling VM placements across servers to disrupt stable co-residency relationships. However, most existing methods adjust VM allocation based on coarse-grained environmental information and overlook user-level security attributes, thereby limiting their defensive effectiveness and potentially degrading service performance. In this paper, we propose an adaptive VM allocation framework driven by user threat assessment to enhance cloud security. By analyzing a large-scale real-world dataset from Microsoft Azure, we uncover VM usage patterns that correlate tightly with potential user threats and leverage these behavioral signals to guide security-aware initial VM placement. We further propose an adaptive multi-objective VM migration mechanism based on a non-dominated sorting snow ablation optimizer, which dynamically reshuffles VM allocations to continuously mitigate co-residency risks. Extensive experiments on real-world datasets demonstrate that the proposed framework reduces co-residency risks by up to 25% and defensive costs by up to 20%, while improving overall resource utilization by up to 30% compared with the previous related adaptive cyber defense (ACD) approaches.

Abstract:
Graph-based vertical federated learning (GVFL) enables an active party who owns a labeled graph to collaborate with passive parties who possess additional node features and edges to improve model performance. GVFL shares representations and gradients, allowing passive parties to retain their optimized bottom models, which makes previous GVFL algorithms unable to resist label inference attacks. However, most attacks assume that the attacker has access to the training data’s exact class space, the top model, or labeled auxiliary datasets from the training domain. These strong assumptions are not practical for real-world GVFL applications. In this paper, we propose Knowledge Transfer Attack (KTA), which leverages only auxiliary graphs from non-training domains to infer private labels. To address domain shift and ensure effective supervision transfer, KTA adapts a surrogate classifier in an aligned representation space while mitigating the negative influence of irrelevant outlier-class supervision. Specifically, KTA exploits the global consistency of cross-domain graphs and incorporates adaptive shift parameters into graph encoding. KTA then aligns cross-domain distributions within the shared class space and mitigates negative transfer by filtering outlier source classes. Experiments confirm the effectiveness of KTA in inferring the active party’s private labels and superiority over state-of-the-art attacks.

Abstract:
The Industrial Internet of Things (IIoT) faces increasing security risks as its applications widen. Compared with the Internet, the IIoT has a broader attack surface and unique structural characteristics, which makes it difficult to directly transfer previous vulnerability analysis techniques. This paper proposes a triple-stage vulnerability analysis framework (TS-VulA) for IIoT via attack graphs, combining ModernBERT and multi-layer heterogeneous networks. In the first stage, a SentenceBERT model based on ModernBERT is combined with IIoT disruption losses to conduct a single-node vulnerability assessment in terms of the likelihood and criticality of vulnerability exploitation. In the second stage, an IIoT device importance calculation based on multi-layer heterogeneous network theory is proposed, which captures the inherent relationships among various IIoT devices. In the third stage, attack graph rules are extended for IIoT, and node priorities are computed by integrating vulnerability assessment and device importance, which can guide the mitigation strategy. Extensive experiments demonstrate that the proposed vulnerability assessment method achieves an average accuracy and average precision of 87.86% and 87.33%, respectively, both exceeding the existing methods. Simulated case studies illustrate that the proposed TS-VulA outperforms the prevailing vulnerability analysis methods.

Abstract:
This article considers the dynamic time resource scheduling problem for a phased array radar performing joint search and tracking tasks in an adversarial jamming environment. Uniform resource scheduling in such conditions leads to security-critical failures, including the missed detection of weak threats and loss of tracking. To address this challenge, a multi-step dynamic optimization model with a receding-horizon strategy is formulated for joint search-and-tracking time resource scheduling under adversarial jamming. Then, a framework integrating Pontryagin’s maximum principle and the alternating direction method of multipliers (ADMM) is developed to solve the resulting problem. The maximum principle produces future-aware adjoint signals that quantify the future risk of jamming-induced sensing loss, while ADMM enforces per-stage feasibility and handles the non-convex optimization problem. Numerical simulations are conducted to verify that the proposed algorithm maintains higher tracking accuracy, stronger jamming resilience, and higher total utility than several comparison methods, thereby improving sensing robustness in adversarial conditions.

Abstract:
Deep neural networks for 3D point cloud analysis are widely used in applications such as autonomous driving and robotics, yet they remain highly vulnerable to adversarial attacks. Existing methods typically minimize point-wise distances to preserve geometry, which constrains perturbations and leads to a trade-off between imperceptibility and attack strength. To address this limitation, we propose a cage-based adversarial deformation framework that generates semantically consistent perturbations aligned with natural intra-class variations. Our method refines a source cage, predicts adversarial cage displacements by fusing source–target features, and computes smooth point-wise offsets using solid-angle–and distance-aware weights. This enables globally coherent deformations that appear natural to humans while effectively misleading classifiers. Extensive experiments on ModelNet40, ShapeNet-Part, and ScanObjectNN show that our approach achieves consistently high attack success rates while simultaneously improving point uniformity and reducing local geometric distortions. Furthermore, the perturbations remain effective against various defense methods.

Abstract:
Backdoor attacks targeting text-to-image diffusion models have advanced rapidly. However, current backdoor samples often exhibit two key abnormalities compared to benign samples: 1) Semantic Consistency, where backdoor prompts tend to generate images with similar semantic content even with significant textual variations to the prompts; 2) Attention Consistency, where the trigger induces consistent structural responses in the cross-attention maps. These consistencies leave detectable traces for defenders, making backdoors easier to identify. In this paper, toward stealthy backdoor samples, we propose Trigger without Trace (TwT) by explicitly mitigating these consistencies. Specifically, our approach leverages syntactic structures as backdoor triggers to amplify the sensitivity to textual variations, effectively breaking down the semantic consistency. Besides, a regularization method based on Kernel Maximum Mean Discrepancy (KMMD) is proposed to align the distribution of cross-attention responses between backdoor and benign samples, thereby disrupting attention consistency. Extensive experiments demonstrate that our method achieves a 97.5% attack success rate while exhibiting stronger resistance to defenses. It achieves an average of over 98% backdoor samples bypassing three state-of-the-art detection mechanisms, revealing the vulnerabilities of current backdoor defense methods. The code is available at https://github.com/Robin-WZQ/TwT
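
As background for the KMMD regularizer mentioned above, here is a minimal sketch of a kernel Maximum Mean Discrepancy estimate between two sample sets, using an RBF kernel and the standard biased estimator. The synthetic arrays stand in for cross-attention responses, and the kernel choice, bandwidth `sigma`, and all names are assumptions made for illustration; the paper's actual regularization term may differ.

```python
import numpy as np


def rbf_kernel(x: np.ndarray, y: np.ndarray, sigma: float) -> np.ndarray:
    """Pairwise RBF kernel values between rows of x (n, d) and rows of y (m, d)."""
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def mmd2(x: np.ndarray, y: np.ndarray, sigma: float = 2.0) -> float:
    """Biased estimate of the squared MMD between the samples x and y."""
    k_xx = rbf_kernel(x, x, sigma).mean()
    k_yy = rbf_kernel(y, y, sigma).mean()
    k_xy = rbf_kernel(x, y, sigma).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)


# Synthetic stand-ins for feature vectors (illustration only).
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(200, 4))
benign_2 = rng.normal(0.0, 1.0, size=(200, 4))
shifted = rng.normal(2.0, 1.0, size=(200, 4))

same_dist = mmd2(benign, benign_2)   # small: both drawn from the same distribution
diff_dist = mmd2(benign, shifted)    # larger: the means differ
print(same_dist < diff_dist)
```

Minimizing such a quantity as a training penalty pulls the two sets of responses toward the same distribution, which is the role the abstract assigns to the regularizer.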

Abstract:
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP), which enforces local differential privacy (LDP) on sensitive data while allowing more accurate inference on non-sensitive data. In this setting, we completely characterize the fundamental privacy–utility trade-off. The converse proof builds on several key ideas, including a generalized uniform asymptotic Cramér–Rao lower bound, a reduction showing that it suffices to consider a newly defined class of extremal ULDP mechanisms, and a novel distribution decomposition technique tailored to ULDP constraints. For the achievability, we propose a class of utility-optimized block design (uBD) schemes, obtained as nontrivial modifications of the block design mechanism known to be optimal under standard LDP constraints, while incorporating the distribution decomposition idea used in the converse proof and a score-based linear estimator. These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-preserving statistical inference.
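
To make the estimation setting concrete, the sketch below shows discrete distribution estimation under standard LDP using k-ary randomized response with an unbiased linear estimator. It is a baseline illustration only: it is not the uBD scheme proposed in the paper and it does not use the ULDP relaxation for non-sensitive data. The alphabet size, privacy budget, sample size, and all names are assumptions.

```python
import numpy as np


def krr_perturb(values: np.ndarray, k: int, eps: float, rng: np.random.Generator) -> np.ndarray:
    """Report the true symbol with probability e^eps / (e^eps + k - 1),
    otherwise a uniformly random different symbol."""
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    keep = rng.random(values.shape[0]) < p
    # A shift in {1, ..., k-1} modulo k always lands on a different symbol.
    shift = rng.integers(1, k, size=values.shape[0])
    return np.where(keep, values, (values + shift) % k)


def krr_estimate(reports: np.ndarray, k: int, eps: float) -> np.ndarray:
    """Unbiased estimate of the input distribution from perturbed reports."""
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = 1.0 / (np.exp(eps) + k - 1)
    freq = np.bincount(reports, minlength=k) / reports.shape[0]
    return (freq - q) / (p - q)


rng = np.random.default_rng(0)
k, eps, n = 4, 1.0, 200_000
true_p = np.array([0.5, 0.3, 0.15, 0.05])

data = rng.choice(k, size=n, p=true_p)
reports = krr_perturb(data, k, eps, rng)
estimate = krr_estimate(reports, k, eps)
print(np.round(estimate, 3))
```

The estimator inverts the known perturbation probabilities, so its entries sum to one by construction, although individual entries can be slightly negative for small samples.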

Abstract:
Skyline optimization is a powerful tool for filtering prominent data to support analysis and decision-making. However, traditional centralized skyline predicates are inadequate for contemporary data islands, and shallow data federation poses a threat to privacy with sensitive data. In existing distributed environments, achieving both efficiency and security in skyline computation remains a critical challenge. This paper addresses the challenge of performing secure skyline predicates on encrypted data federation while safeguarding both the dataset and skyline from unauthorized access. We propose a novel asynchronous structured skyline predicate based on vertical dominance and truth-value conversion, taking full advantage of distributed computing. Furthermore, we introduce a secure optimization that balances security and efficiency, thereby facilitating a distributed skyline predicate. We evaluate the efficiency and scalability across various parameters, demonstrating improvements in traversal overhead and expensive ciphertext operations.
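
For context, a skyline query returns the points that no other point dominates. The plaintext sketch below illustrates that baseline definition with a simple quadratic scan; it does not implement the paper's encrypted, asynchronous predicate, its vertical dominance, or its truth-value conversion. The example data and the convention that smaller values are better are assumptions.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse in every dimension and strictly
    better in at least one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: list[tuple[float, ...]]) -> list[tuple[float, ...]]:
    """Return the points not dominated by any other point (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical records: (price, distance to venue).
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 9)]
result = skyline(hotels)
print(result)  # [(50, 8), (60, 3), (80, 2)]
```

Here (70, 5) is dropped because (60, 3) is both cheaper and closer, and (90, 9) is dropped because (50, 8) is. A secure variant has to evaluate the same dominance comparisons without revealing the underlying values.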

Abstract:
Physical adversarial attacks (PAAs) pose a serious security threat to millimeter-wave (mmWave) radar systems used in safety-critical applications such as autonomous driving and security checking. These attacks, manipulating radar signals via specially crafted materials, are proven feasible and can cause severe sensing failures; however, effective defenses remain unexplored due to the difficulty of distinguishing adversarial examples from normal environmental objects. This paper presents mmGuard, a physics-based defense framework that addresses this challenge by exploiting a fundamental insight: the engineering process that makes materials adversarial inevitably creates detectable physical signatures. We identify three key domains where adversarial examples show artificial nature: spatial phase discontinuities, anomalous radar cross-section patterns, and violations of natural physico-kinematic relationships. mmGuard systematically captures these signatures through multi-domain feature extraction, enhances their discriminability via neural refinement, and enables efficient per-object attack detection and mitigation compatible with automotive radar update rates. To enable evaluation, we introduce mmAD, comprising over 110,000 annotated radar frames with diverse adversarial examples across realistic deployment scenarios. Experimental results demonstrate that mmGuard achieves over 90% detection accuracy while exhibiting strong in-distribution performance, with few-shot adaptation enabling calibration to unseen settings. Case studies further validate that mmGuard can reliably defend against PAAs in real-world settings.

Abstract:
Multi-factor authentication (MFA) has been widely applied in various fields, including smart homes, autonomous driving, and mobile communication. Although a number of MFA schemes with different security goals and properties have been proposed, most of them pay little attention to forgotten or lost passwords, which may lead to the permanent loss of the account. Additionally, little effort has been devoted to designing MFA schemes with fine-grained access control to perform authentication flexibly. These issues raise the question of how to construct an MFA scheme with dynamic password recovery and fine-grained access control. In this paper, we propose MFA-DPRF, the first multi-factor authentication scheme that combines attributes with a dynamic password recovery method to provide dynamic password recovery and fine-grained control. In MFA-DPRF, authentication succeeds only when a user provides both a valid password and a set of attributes satisfying the specified access policy, thereby enabling fine-grained access control. Furthermore, a dynamic password recovery method based on secret questions and the secret sharing technique is designed to address forgotten and lost passwords. Users can not only recover the original password locally, but also update secret values such as security questions to enhance security. The security of MFA-DPRF can be reduced to the computational Diffie-Hellman problem under the Random Oracle Model. We also analyze the security of MFA-DPRF in the universally composable framework to ensure composable security. An informal analysis further indicates that MFA-DPRF is secure against known attacks. Compared with state-of-the-art works, performance analysis shows that MFA-DPRF is superior in security and efficiency.

Abstract:
In dynamic Windows malware detection, deep learning models are extensively deployed to analyze API sequences. Methods based on API sequences play a crucial role in malware prevention. However, because continuous API updates and changes in API call sequences drive the constant evolution of malware variants, the detection capability of API sequence-based malware detection models diminishes significantly over time. We observe that the API sequences of malware samples before and after evolution usually have similar malicious semantics. Specifically, compared to the original samples, evolved malware samples often use the API sequences of the pre-evolution samples to achieve similar malicious behaviors. For instance, they access similar sensitive system resources and extend new malicious functions based on the original functionalities. In this paper, we propose MME (Mitigating the impact of Malware Evolution), a framework that can enhance existing API sequence-based malware detectors and mitigate the adverse effects of malware evolution. To help detection models capture the similar semantics of these post-evolution API sequences, our framework represents API sequences using API knowledge graphs and system resource encodings and applies contrastive learning to enhance the model’s encoder. Results indicate that, compared to regular Text-CNN, our framework can significantly reduce the false positive rate by 13.10% and improve the F1-Score by 8.47% on five years of data, achieving the best experimental results. Additionally, evaluations show that our framework can reduce the human cost of model maintenance: only 1% of the budget per month is needed to reduce the false positive rate by 11.16% and improve the F1-Score by 6.44%.

Abstract:
Multi-view images are essential for modern radiance field reconstruction methods like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). While image watermarking is a crucial data protection and ownership verification technique, it faces unprecedented challenges in multi-view scenarios. Traditional 2D watermarking techniques often fail to maintain detectability in rendered views, while existing 3D watermarking methods are typically limited to specific reconstruction methods and require access to the reconstruction process. To address these limitations, we propose MantleMark, a watermarking framework that migrates watermarks from multi-view images to radiance fields via frequency modulation. Our key insight is constructing a mantle-like Frequency-domain Watermarking Representation in 3D frequency space, which can be projected to create view-dependent watermarking patterns. Relying upon the Fourier Projection-Slice Theorem, we embed these patterns through magnitude spectrum modulation in the image frequency domain, enabling watermarks to migrate into 3D representations. This approach ensures watermark detectability in rendered views regardless of the reconstruction methods used by adversaries. Extensive experiments demonstrate that our method achieves robust watermark detection while maintaining high visual quality across various radiance field-based reconstruction methods.
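
The Fourier Projection-Slice Theorem that the abstract relies on can be checked numerically in its simplest discrete, axis-aligned form: the 2D DFT of a projection of a 3D volume equals the zero-frequency plane of the volume's 3D DFT along the projection axis. The random volume and the variable names below are illustrative assumptions; this sketch does not reproduce MantleMark's watermark representation or its embedding procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))  # stand-in for a 3D signal

# Orthographic projection along axis 0 (sum over depth) gives a 2D image.
projection = volume.sum(axis=0)

# Spectrum of the projected image.
projection_spectrum = np.fft.fft2(projection)

# Central slice of the 3D spectrum: the plane where the frequency
# along the projection axis is zero.
central_slice = np.fft.fftn(volume)[0, :, :]

matches = np.allclose(projection_spectrum, central_slice)
print(matches)  # True
```

Oblique viewing directions correspond to rotated planes through the origin of the 3D spectrum, which is why a pattern placed in 3D frequency space can reappear in the spectra of images taken from different views.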

Abstract:
Automatic face recognition systems are widely used in different applications which require authentication. Among various types of attacks against face recognition systems, morphing attacks have become a major concern, where face images of two subjects are combined into a face morph image which is submitted for enrolment. In a successful attack, both contributing subjects can then authenticate against the morph reference. In this work, we propose a new method to generate face morphs based on inversion of the optimal morph embeddings. To this end, we first find the optimal morph embeddings using the face embeddings of two source face images and then use state-of-the-art template inversion techniques to generate the morph. We use three different template inversion methods: the first one exploits a fully self-contained embedding-to-image inversion model, while the second and third leverage the realistic image generation of a pretrained StyleGAN network and a foundation model based on diffusion models, respectively. Furthermore, we use optimization methods to improve the performance of template inversion methods in the generation of face morph images from optimal morph embeddings. In our experiments, we evaluate the performance of generated face morph images and compare them with state-of-the-art morph generation methods, showing the superiority of our method. We showcase that our method can outperform state-of-the-art deep-learning-based morph generation methods, both in white-box and black-box attack scenarios, and compete with state-of-the-art landmark-based morph generation methods. Moreover, we perform a practical print-scan attack to simulate a real-world scenario and compare our method with previous methods in the literature, demonstrating the effectiveness and superiority of our method. The source code of our proposed method and all experiments are publicly available.

Abstract:
With the increasing sophistication of malware, enhanced Attributed Control Flow Graphs (ACFGs) have become a fundamental representation and are widely applied in malware detection. However, existing CFG-based detection techniques primarily extract shallow features of malware, neglecting deeper structural and semantic characteristics. Additionally, retaining all basic blocks in CFGs significantly increases the memory overhead of detection models. To address these issues, we propose MCLPF, collaborative malware detection with interpretable pruning, to improve the overall performance of existing malware detection systems that rely on fine-grained control flow features. MCLPF first introduces a novel Attributed Interpretable Flow Graph (AIFG) to extract functional attributes, integrating node-level features, edge-level features, and assembly language embedding features derived from Large Language Models (LLMs). Subsequently, it proposes an efficient and reliable detection scheme by alternately updating the graph structure and language learning modules through L-Step and G-Step, rather than synchronously training Language Models (LMs) with Graph Neural Networks (GNNs) on large-scale graphs. We conduct experiments using public datasets involving four different architectures (i.e., PE-32, PE-64, ELF-32, and ELF-64) and demonstrate that our model achieves an exceptionally high detection accuracy (i.e., 99.30%). After pruning 100% of noncritical nodes and edges, the sample size is reduced to approximately 8% of the original, with an average time cost reduction of 74.7%, while the detection performance fluctuation averages only about 1%. Extensive cross-dataset evaluations validate the effectiveness and efficiency of the proposed method.

Abstract:
Machine learning malware detectors are vulnerable to adversarial EXEmples, i.e., carefully-crafted Windows programs tailored to evade detection. Unlike other adversarial problems, attacks in this context must be functionality-preserving, a constraint that is challenging to address. As a consequence, heuristic algorithms are typically used, which inject new content, either randomly-picked or harvested from legitimate programs. In this paper, we show how learning malware detectors can be cast within a zeroth-order optimization framework, which allows incorporating functionality-preserving manipulations. This permits the deployment of sound and efficient gradient-free optimization algorithms, which come with theoretical guarantees and allow for minimal hyper-parameter tuning. As a by-product, we propose and study ZEXE, a novel zeroth-order attack against Windows malware detection. Compared to state-of-the-art techniques, ZEXE improves the evasion rate while reducing the size of the injected content to less than one third.
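
To illustrate what "zeroth-order" (gradient-free) optimization means here, the following is a minimal sketch of a generic two-point random gradient estimator. It is not ZEXE: the function `toy_score` is a hypothetical stand-in for a detector's output, the parameters `mu` and `lr` are arbitrary, and no functionality-preserving manipulation of executables is modeled.

```python
import numpy as np

def zeroth_order_minimize(f, x0, steps=500, mu=1e-3, lr=0.05, seed=0):
    """Minimize f using only function evaluations (no gradients)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        u = rng.standard_normal(x.shape)       # random search direction
        # Finite-difference estimate of the directional derivative along u,
        # turned into a gradient estimate by scaling u.
        g = (f(x + mu * u) - f(x)) / mu * u
        x -= lr * g
    return x

def toy_score(x):
    # Hypothetical smooth objective standing in for a detector score.
    return float(np.sum((x - 1.0) ** 2))

x_start = np.zeros(4)
x_final = zeroth_order_minimize(toy_score, x_start)
print(toy_score(x_start), toy_score(x_final))
```

Each iteration costs two queries to the objective, which is why this family of methods suits black-box settings where only the detector's output is observable.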

Abstract:
Federated Learning (FL) offers collaborative model training across multiple decentralized devices without the need to share data directly, enhancing privacy and data security. However, FL systems are susceptible to backdoor attacks, where malicious clients inject poisoned weights during training. Existing defenses, primarily based on anomaly detection, are prone to erroneous rejections of normal weights while accepting poisoned ones, largely due to shortcomings in quantifying similarities among client models. Furthermore, other defenses demonstrate effectiveness only when dealing with a limited number of malicious clients, typically fewer than 10%. To alleviate these vulnerabilities, we present G2uardFL, a protective framework that translates the detection of malicious clients into an attributed graph clustering problem, thus safeguarding FL systems. Specifically, this framework employs a client graph clustering approach to identify malicious clients and integrates an adaptive mechanism to amplify the discrepancy between the aggregated model and the poisoned ones, effectively eliminating embedded backdoors. Through empirical evaluation, comparing G2uardFL with cutting-edge defenses, such as FLAME (USENIX Security 2022) and DeepSight (NDSS 2022), against various backdoor attacks, including 3DFed (SP 2023), our results demonstrate its significant effectiveness in mitigating backdoor attacks while having a negligible impact on the aggregated model’s performance on benign samples (i.e., the primary task performance). For instance, in an FL system with 25% malicious clients, G2uardFL reduces the attack success rate to 10.61%, while maintaining a primary task performance of 80.98% on the CIFAR-10 dataset. This surpasses the best-performing baseline, which only achieves an attack success rate of 19.54%.

Abstract:
To provide timely warnings and prevent potential damage, it is crucial to detect anomalous actions that threaten public safety through surveillance cameras. Compared to normal actions, anomalous actions often occupy only a small portion of surveillance videos and exhibit more complex manifestations in terms of time and space. Considering that normal action recognition methods fail to highlight crucial information from small-sized patches, we propose the Spatio-temporal Key Patch Selection Network (STKPS-Net). It includes a Spatially Adaptive Key Patch Selection (SAKPS) module to select small but informative patches, and a Long-short Feature Map Spatio-temporal Relation (LFMSR) module to capture dynamic changes in anomalous actions. Additionally, a spatio-temporal refined loss is introduced to enhance fine-grained feature learning. Experimental results on the HMDB51, Kinetics, and UCF-Crime v2 datasets show that our STKPS-Net achieves state-of-the-art performance in few-shot anomalous action recognition, outperforming the most competitive methods by 1.2% on the anomalous action dataset UCF-Crime v2. More details can be found at https://github.com/xiaojs18/STKPS-Net.

Abstract:
Data Integrity Auditing (DIA) enables users to remotely verify whether their data saved in third-party clouds has been maliciously tampered with or compromised. As an extension of DIA in certificateless cryptography, certificateless DIA (CL-DIA) integrates the merits of conventional public-key cryptography (no key escrow) and identity-based cryptography (no certificates). However, CL-DIA schemes depend on a reliable third-party auditor (TPA) to perform integrity audits, inevitably suffering from performance bottlenecks and single-point-of-failure problems. Moreover, almost all current CL-DIA schemes were designed with computationally expensive bilinear pairings. Cryptanalysis demonstrates that the only existing pairing-free CL-DIA scheme fails to achieve unforgeability of auditing proofs. In this work, we put forward a lightweight blockchain-assisted CL-DIA scheme. The scheme achieves DIA through the blockchain instead of a single TPA, thereby overcoming the problems caused by the TPA-based centralized auditing model. Then, by avoiding time-consuming pairing operations and employing edge servers in generating verifiable tags for the uploaded data of users, its performance surpasses previous pairing-based CL-DIA schemes, particularly in terms of computation efficiency. Furthermore, we provide formal proofs in the random oracle model demonstrating that our scheme achieves unforgeability of verifiable tags and auditing proofs, ensures data privacy security, and is resistant to collusion attacks between the EN and the CSP. Finally, experimental results show that when auditing 25 file blocks, our scheme only costs 0.29s, which reduces the total time cost of the integrity auditing phase by 48.2%-85.5% compared to current pairing-based CL-DIA schemes.

Abstract:
Backdoor attacks pose severe security challenges to federated learning systems due to their stealthy nature. Existing detection methods primarily focus on identifying anomalies by analyzing discrepancies in client model updates. However, in federated learning, the non-independent and identically distributed (non-IID) nature of client data leads to inconsistencies among local model updates, which can mask the distinguishing features of backdoor attacks and consequently degrade the performance of detection methods. Unlike benign models, which are trained solely for a single classification task, backdoored models are simultaneously optimized for both the classification (main) task and the backdoor task. Therefore, training the backdoored models can be regarded as a multi-task learning problem. Inspired by information bottleneck theory, we observe that backdoored models exhibit more stable feature representations than benign models when performing the main task. Based on this insight, we propose a novel stability metric that quantitatively captures the disparity in feature map stability between backdoored and benign models. Leveraging this metric, we develop a new backdoor detection framework for federated learning. Our method computes anomaly scores for each client and selectively aggregates models with benign characteristics, effectively defending against backdoor attacks. We validate our approach through extensive experiments on multiple benchmark datasets under non-IID settings. The results demonstrate that our method consistently achieves high detection performance across a range of backdoor scenarios and data heterogeneity levels.

Abstract:
As a fundamental task in graph data analysis, degree statistic estimation serves as the foundation for many complex tasks. Local differential privacy (LDP) preserves the privacy inherent in raw degrees without a trusted third party. Existing methods struggle to balance different types of degree statistics. They either introduce excessive noise when estimating the degree distribution because they do not fully leverage the properties of edge LDP, or are limited to polynomial statistic estimation only. We design a locally private framework for degree statistic estimation (LFS), using the Laplace mechanism to provide appropriate privacy protection under edge LDP. LFS can estimate three types of degree statistics: polynomial, distribution, and single-point. Based on the degrees perturbed with Laplace noise, we transform degree distribution estimation into a linear regression problem, then post-process the estimated distribution to mitigate the excessive smoothing introduced by the regularization term. We also achieve single-point statistic estimation by considering the degree distribution and the properties of Laplace noise. Systematic experiments on five datasets demonstrate that LFS consistently outperforms existing methods in four utility metrics.
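
As a point of reference for the Laplace mechanism under edge LDP, the sketch below shows the standard local perturbation of a node's degree and a simple mean estimate. It is not the LFS framework: the helper `perturb_degrees`, the synthetic degrees, and the choice of `epsilon` are assumptions for illustration, and the regression-based distribution estimation described in the abstract is not reproduced.

```python
import numpy as np

def perturb_degrees(degrees, epsilon, rng):
    """Each user adds Laplace noise to their own degree before reporting it.

    Adding or removing one edge in a user's neighbor list changes that
    user's degree by 1, so the sensitivity is 1 and the scale is 1/epsilon.
    """
    scale = 1.0 / epsilon
    return degrees + rng.laplace(loc=0.0, scale=scale, size=len(degrees))

rng = np.random.default_rng(42)
true_degrees = rng.integers(0, 50, size=20000)   # synthetic degrees (assumption)
noisy_degrees = perturb_degrees(true_degrees, epsilon=1.0, rng=rng)

# Laplace noise is zero-mean, so the average of the noisy reports is an
# unbiased estimate of the average degree (a degree-1 polynomial statistic).
true_mean = true_degrees.mean()
estimated_mean = noisy_degrees.mean()
print(true_mean, estimated_mean)
```

Averages are easy because the noise cancels out; the degree distribution is harder, since the histogram of noisy degrees is the true distribution convolved with the Laplace density, which is why a dedicated estimation step is needed.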

Affiliations: School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Taipa, Macau, China; School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou, China; School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China; Engineering Research Center of Digital Forensics of Ministry of Education and the School of Computer Science, Nanjing University of Information Science & Technology, Nanjing, China; Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China

Abstract:
Ethereum 2.0 (ETH2) marks a pivotal shift in blockchain technology, transitioning from a Proof-of-Work (PoW) to a Proof-of-Stake (PoS) consensus mechanism, with Gasper at its core. While this evolution promises enhanced scalability and energy efficiency, the performance of its block proposal stage is highly sensitive to network latency and system parameters, such as slot length. This sensitivity introduces a critical trade-off between throughput and security, measured by the probability of blockchain forking. This paper reveals that network latency is not just a passive risk but an exploitable attack surface. We introduce the “adaptive latency-driven equivocation attack”, a novel adversarial strategy where an attacker deliberately creates forks while mimicking the behavior of a high-latency node, thus achieving plausible deniability. To formally analyze and quantify the impact of this threat, we develop a comprehensive theoretical model that uses Markov chains to analyze the fork probability and throughput of Gasper’s block proposal mechanism under both honest and adversarial conditions. Through extensive simulations, we validate the accuracy of our model in both normal and bursty traffic conditions. Our findings provide a systematic methodology for optimizing system parameters to achieve a robust balance between performance and security, offering a foundational guide for configuring ETH2 networks against sophisticated, latency-based threats.

Abstract:
Customized data sharing enables data owners to define access policies tailored to users’ specific preferences, while users can selectively acquire data of interest from designated owners. In cloud storage scenarios, outsourced data are encrypted and often governed by identical sub-policies that are frequently accessed by users. However, most existing schemes suffer from limitations such as one-sided access control, inefficient decryption, or privacy leakage, rendering them inadequate for these scenarios. In this paper, we propose a secure and customized data sharing scheme with identical sub-policy and bilateral access control (CSAC) for cloud storage. We leverage the technique of Secure Set Membership Test (SSMT) to enable bilateral access control, supporting privacy-preserving preference matching and customized data sharing. To improve the efficiency of data sharing and decryption, we design an attribute-based access control mechanism that enables users to locally store identical sub-policy parameters. By reusing these parameters in subsequent decryptions, CSAC eliminates redundant decryption operations and significantly reduces computational overhead. Security analysis demonstrates that CSAC is semantically secure under the chosen-plaintext attack model, preserving the confidentiality of shared data, user preferences, and preference matching information. Experimental results show that CSAC achieves nearly a 4× improvement in decryption performance compared with the state-of-the-art scheme, particularly when accessing a large proportion of data.

Abstract:
When it comes to the marriage of graph neural networks (GNNs) and model extraction attacks, the deployment of GNNs within Machine Learning as a Service (MLaaS) through a public pay-per-query API has opened up new attack surfaces. Existing defenses either sacrifice prediction accuracy or fail to thwart more advanced attacks. We investigate this dilemma and discover that fortified models with complex and narrow decision regions are difficult to reproduce. Nevertheless, complex and narrow decision boundaries are prone to violate the subspaces of neighbor classes under the intrinsic coupling property of graph structure. Furthermore, class-wise representative features within the interior of class-wise subspaces endow the attackers with the capability of functionality replication. Here, we propose a novel model extraction defense, dubbed Decision Boundary-aware Counterfactual Learning (DBCL). DBCL proactively launches counterattacks on potential model extraction attacks, from the very beginning of sensitivity measurement that implicitly detects the malicious queries, such that class-wise representative features embodied in the highly sensitive query batch trigger the demand of worsening their query results unconsciously. Moreover, DBCL draws inspiration from counterfactual learning, aiming at finding the decision boundary-aware adversarial topology perturbations for ambiguously classified query samples, i.e., hard samples, to cross the decision boundary exactly, which introduces the tractive behaviour w.r.t. the inter-connected sensitive samples for class-wise ambiguous topology features. From the graph-structured actionable insights, DBCL innovatively finds the minimum perturbation sufficient for counterfactual learning, without jeopardizing the victim model’s predictive capacity by including confidently classified query samples, i.e., easy samples including sensitive and non-sensitive samples, into their correct classes. Empirically, DBCL shows its effectiveness in reducing the extraction accuracy of the SOTA model extraction attempt with different GNN backbone encoders in evaluating node classification performance. Moreover, we show that DBCL is robust to adaptive model extraction attacks.

Abstract:
Unsupervised Visible-Infrared Person Re-Identification (USL-VI-ReID) aims to match person images across visible and infrared modalities without identity annotations, addressing challenges such as cross-modal discrepancy and unlabeled data. Existing methods, however, often suffer from excessive sub-clusters, identity mixing, and unreliable cross-modal associations, which degrade matching performance. To overcome these issues, we propose MACHANet, a novel framework. The Memory Learning via Progressive Hybrid Clustering (MLPHC) module reduces excessive sub-clustering and enhances memory representations by first applying Harmonic Discrepancy Clustering with harmonic constraints and a core-edge mechanism, then gradually transitioning to DBSCAN as features become more discriminative. The Global Cross-Modal Positive Sample Alignment (GCPSA) module constructs a global set of cross-modal positive pairs, selecting the most similar visible–infrared samples of the same identity and computing alignment losses across intra- and inter-modalities. By maximizing mutual information and minimizing cross-modal distribution gaps, GCPSA effectively reduces modality discrepancies and suppresses noisy identity associations. Finally, the Multi-Modal Support Sample Expansion Alignment (MSSEA) module dynamically expands multi-modal support samples and incorporates residual-based representations to refine clusters, separate mixed identities, and progressively merge sub-identities. Extensive experiments on SYSU-MM01 and RegDB show that MACHANet outperforms existing state-of-the-art methods, including some supervised approaches. The source code will be publicly released.

Abstract:
Key management faces significant challenges in terms of efficiency, scalability, and quantum resistance in smart grid environments. This paper addresses these challenges by proposing a novel mutual authentication and key agreement (AKA) protocol for the Noisy Intermediate-Scale Quantum (NISQ) era. The protocol integrates quantum and classical information streams, enabling secure key agreement without third-party reliance. For the first time, quantum-based mutual identity authentication is applied to smart grid security, with a comprehensive analysis of resilience against various attacks, including man-in-the-middle, quantum computation, relay, impersonation, and DoS. Experimental results show that the scheme achieves a communication cost of 0.9 KB for 2 rounds, outperforming quantum-resistant schemes (2.5 KB) and performing similarly to classical schemes (1.2 KB). The scheme requires 2 quantum preparations and 1 measurement, offering higher efficiency compared to quantum-resistant methods. Compared to QKD and computational hardness-based models, our approach provides superior key management efficiency, practical implementation in non-fault-tolerant quantum systems, and enhanced security protection. Rigorous analysis and validation using IBM’s quantum-enabled environments demonstrate the robustness of the protocol in addressing modern smart grid security challenges.

Abstract:
Gait recognition has attracted increasing attention in both academia and industry as a non-intrusive human recognition technology that works from a distance without requiring cooperation. Triplet loss, which enforces relative distance constraints, is a fundamental component in gait recognition. Recently, several gait-specific triplet losses have been introduced to gait recognition. However, they only focus on sample selection and weighting to enhance constraints without exploring the gradient properties of the Cosine/Euclidean metrics, which fundamentally influence the model training efficiency and feature discriminability. In this paper, we theoretically analyze triplet loss gradients combined with weight decay and identify inherent limitations due to inadequate norm control: Cosine metric triplet loss ($\mathcal{L}_{cos}$) exhibits excessive gradients resulting from small feature norms, while Euclidean metric triplet loss ($\mathcal{L}_{euc}$) suffers from a small margin-to-norm ratio due to large feature norms. To address these issues, we propose two norm-control approaches to constrain the feature norm in a stable range: 1) Norm-Variance-Regularized Collaboration and 2) Norm-Based Regularization. Extensive experiments show that our methods outperform state-of-the-art results under both Cosine and Euclidean evaluation metrics on three in-the-wild datasets: Gait3D, GREW, and BUAA-Duke-Gait. The code will be available at https://github.com/bgdpgz/TL-Gait.
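
The role of the feature norm in the two triplet losses can be seen in a small numeric sketch. This uses the standard triplet loss definition with a hypothetical margin of 0.2 and hand-picked 2D features; it does not implement the paper's two norm-control approaches.

```python
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    return 1.0 - float(a @ b) / float(np.linalg.norm(a) * np.linalg.norm(b))

def triplet_loss(anchor, positive, negative, distance, margin=0.2):
    # Standard triplet loss: pull the positive closer than the negative by a margin.
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

# Hypothetical features; only their overall norm is varied below.
anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negative = np.array([0.8, 0.3])

scales = (0.1, 1.0, 10.0)
cos_losses = [triplet_loss(s * anchor, s * positive, s * negative, cosine_distance)
              for s in scales]
euc_losses = [triplet_loss(s * anchor, s * positive, s * negative, euclidean_distance)
              for s in scales]
for s, lc, le in zip(scales, cos_losses, euc_losses):
    print(f"scale={s:>4}: cosine loss={lc:.4f}, euclidean loss={le:.4f}")
```

With a fixed margin, the Euclidean loss is active at small norms but vanishes once the features are scaled up, because the margin becomes small relative to the distances. The cosine loss value is unchanged by scaling, but the gradient of cosine similarity with respect to a feature is inversely proportional to that feature's norm, so small norms lead to large gradients.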

Abstract:
Coordinated cyber-physical attacks (CCPAs) pose a critical threat to the secure operation of smart power grids. While existing studies often assume simultaneous or sequence-agnostic attack strategies, this paper proposes a sequence-aware CCPA framework that explicitly models the temporal coupling between cyber manipulation and physical sabotage. We enhance the classical load redistribution attack (LRA) by addressing four key limitations: detectability due to infeasible power flows, violation of power balance, omission of post-attack system response, and insensitivity to attack sequencing. Specifically, we formulate two distinct bilevel attack models, namely Cyber-to-Physical ($C \rightarrow P$) and Physical-to-Cyber ($P \rightarrow C$), and solve them via an exact KKT-based Mixed-Integer Linear Programming (MILP) reformulation and a scalable Benders Decomposition (BD) framework. Experiments on IEEE 14-, 57-, and 118-bus systems demonstrate that $C \rightarrow P$ attacks induce significantly more line overloads than $P \rightarrow C$, validating the heightened risk of cyber-initiated cascades. Moreover, our BD approach accurately identifies spatial vulnerability hotspots with high fidelity, even when the physical attack budget is extended from $R_p = 1$ to $R_p = 2$, confirming the framework’s scalability and practical relevance. The results provide actionable insights for adaptive grid protection against sophisticated, sequential threats.

Abstract:
Although encryption offers strong anonymity, it also facilitates the concealment of malicious activities, allowing adversaries to evade detection and posing a great challenge to cybersecurity surveillance. Many existing encrypted traffic classification methods struggle to integrate flow- and packet-level tasks effectively, because the tasks are trained independently, which is redundant. Additionally, packet headers and payloads are treated equally, leaving the rich information in raw bytes, particularly in the abundant payload data, not fully explored. Moreover, they neglect the semantic invariance and common features between data samples, which ultimately results in suboptimal performance. To address these challenges, we propose an effective Multi-Task model using Dual Embedding and Graph Contrastive Learning (MT-DEGCL). Based on the byte-packet-flow structure of network traffic, a parallel dual embedding embeds the header and payload separately, followed by a cross-gated feature fusion strategy to capture the strong local packet-level representation. Then, we construct the traffic interaction graph and further utilize graph contrastive learning to extract the robust global flow-level representation. Finally, a multi-task model is trained for joint flow- and packet-level classification, leveraging the complementary learning between tasks to enhance overall performance. The experimental results on four real datasets highlight the effectiveness of MT-DEGCL, demonstrating superior performance in both tasks. Specifically, on the ISCX-Tor dataset, MT-DEGCL achieves F1 scores of 98.63% for flow-level classification and 98.10% at the packet level, surpassing the state-of-the-art (i.e., DE-GNN) by 2.03% and 83.21%, respectively. Furthermore, MT-DEGCL maximizes the rich information in raw payload bytes, significantly reducing or even nearly eliminating classification loss when using only payload data.

Abstract:
The increasing use of Voice over Internet Protocol (VoIP) technology in telecom fraud has become a serious global concern. Its ability to spoof caller IDs and IP addresses, and the use of overseas or anonymized servers make VoIP-based scams difficult to trace and regulate. As a result, distinguishing VoIP calls from conventional mobile phone calls based on voice signal characteristics is crucial for enhancing anti-fraud measures. However, existing forensic techniques often struggle to accurately identify speech transmitted via VoIP. To address this challenge, we propose a dual-level 1D-CNN that leverages both frame and utterance features for effective VoIP detection. After evaluating a range of acoustic features, we primarily focus on short-frame Mel-Frequency Cepstral Coefficients (MFCCs) due to their effectiveness in capturing VoIP characteristics. Given the frame-based processing and transmission nature of VoIP, we employ a 1D-CNN, rather than the more commonly used 2D-CNN that treats spectrograms as images, to extract frame-level codec features. Finally, we propose a dual-level classification strategy: the frame-level classifier captures encoding discrepancies within individual frames, while the utterance-level classifier aggregates these frame-level features to learn global encoding patterns through global covariance pooling. Experimental results on the VoIP Phone Call Identification Database (VPCID) demonstrate that the proposed method consistently outperforms existing approaches, delivering superior accuracy and robustness across a wide range of challenging scenarios. Moreover, comprehensive ablation studies validate the effectiveness and rationale behind the design of the proposed model architecture.
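
The utterance-level aggregation step can be pictured with a basic form of global covariance pooling. The sketch below is an assumption-laden illustration, not the paper's layer: `frame_features` is random data standing in for 1D-CNN frame embeddings, and any normalization the authors may apply to the covariance matrix is omitted.

```python
import numpy as np

def global_covariance_pooling(frame_features):
    """Summarize a variable number of frames as one fixed-size vector.

    frame_features has shape (num_frames, feature_dim). The output holds the
    upper triangle of the feature covariance matrix, so its length depends
    only on feature_dim, not on the utterance length.
    """
    centered = frame_features - frame_features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / frame_features.shape[0]
    upper = np.triu_indices(cov.shape[0])
    return cov[upper]

rng = np.random.default_rng(1)
short_utterance = rng.standard_normal((100, 8))   # 100 frames, 8-dim features
long_utterance = rng.standard_normal((450, 8))    # 450 frames, same feature dim

pooled_short = global_covariance_pooling(short_utterance)
pooled_long = global_covariance_pooling(long_utterance)
print(pooled_short.shape, pooled_long.shape)
```

Unlike mean pooling, the covariance keeps second-order statistics, meaning how feature dimensions vary together across frames, which is one way to capture patterns that span a whole utterance.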

Abstract:
Guided by the principle of “Never Trust, Always Verify”, Zero Trust Architecture (ZTA) mandates continuous monitoring and analysis of users and entities, highlighting the critical role of behavior analytics. However, the growing volume of audit data and its complex contextual information render many existing behavior analytics methods insufficient. Moreover, most approaches rely on high-quality labeled data for supervised training, limiting their effectiveness against previously unseen malicious behaviors. To address these challenges, we propose the Large Language Model for Behavior Analytics (LLMBA) framework. LLMBA leverages a Large Language Model (LLM) to analyze behavioral patterns of internal users and entities, capitalizing on the LLM’s strong ability to model sequential data. We introduce a multi-level behavior encoding scheme to capture both contextual and temporal information from behavior records, producing rich input representations for the LLM-enhanced model. The LLM is fine-tuned using self-supervised learning, enabling the detection of unknown malicious behaviors. To reduce the computational and storage overhead inherent in LLMs, we apply knowledge distillation to compress the model while maintaining high detection performance. Extensive experiments on the CERT Insider Threat dataset demonstrate that LLMBA outperforms state-of-the-art baselines in detection accuracy. Furthermore, the compressed student model achieves superior performance compared with existing methods under comparable runtime constraints, making LLMBA highly suitable for real-world deployment.

Abstract:
Attribute-Based Encryption (ABE) enables fine-grained access control over outsourced data, but its key generation process typically requires users to disclose their complete attribute sets, introducing significant privacy risks. Existing privacy-preserving approaches—such as those based on zero-knowledge proofs or tightly coupled interactive protocols—suffer from limited scalability, high communication costs, and insufficient support for selective attribute disclosure. To address these limitations, we propose a privacy-enhancing key generation protocol guided by the principle of Minimal Disclosure, which ensures that users disclose only the minimally necessary subset of attributes required for authorization. Our protocol decouples attribute verification from key issuance: users first obtain cryptographically verifiable attribute tokens, and later issue blinded key requests over selectively chosen attributes. This design enables selective disclosure, supports reusable attribute credentials, and enhances user autonomy. To improve scalability, we introduce a lightweight batch verification mechanism that reduces computation and communication overhead for the attribute authority. We prove that our protocol achieves the binding and hiding properties under standard cryptographic assumptions, and we formally verify these guarantees in the symbolic model using the ProVerif tool. In addition, we propose two privacy metrics—Attribute Inference Gain (AIG) and Privacy Gain (PG)—alongside an entropy-based analysis to quantify resistance against attribute inference attacks. Experimental results show that our scheme effectively mitigates inference leakage while offering substantial efficiency gains compared to existing schemes.

Abstract:
Large Language Models (LLMs) have gained widespread use in various applications due to their powerful capability to generate human-like text. However, prompt injection attacks, which involve overwriting a model’s original instructions with malicious prompts to manipulate the generated text, have raised significant concerns about the security and reliability of LLMs. In this paper, we propose PromptFuzz, a novel testing framework that leverages fuzzing techniques to systematically assess the robustness of LLMs against prompt injection attacks. Inspired by software fuzzing, PromptFuzz selects promising seed prompts and generates a diverse set of prompt injections to evaluate the target LLM’s resilience. PromptFuzz operates in two stages: the prepare phase, which involves selecting promising initial seeds and collecting few-shot examples, and the focus phase, which uses the collected examples to generate diverse, high-quality prompt injections. By deploying the attack prompts generated by PromptFuzz in a real-world competition, we ranked 7th out of over 4000 participants (top 0.14%) within 2 hours, demonstrating PromptFuzz’s effectiveness compared to experienced human attackers. We also deployed the generated attack prompts on 50 popular LLM-integrated online applications, including those from Coze and OpenAI, and found that 92% of them can be exploited by PromptFuzz. Finally, we ran PromptFuzz on 15 online LLM-based resume judging applications and found that the responses of 13 of these applications can be hijacked by PromptFuzz.

Abstract:
In the context of the Internet of Things (IoT), the large-scale generation and collection of data can greatly improve the quality of service provided, but they also raise significant concerns about privacy breaches. However, existing privacy-preserving data collection solutions based on local differential privacy (LDP) often struggle to balance security and accuracy when handling composite data types. To address this challenge, in this paper, we propose CSKV, a high-precision and privacy-preserving key-value data collection scheme. Specifically, we first design a padding and sampling protocol to improve data utility. Then, we propose two randomized response mechanisms to safely perturb keys and values in a cohesive and segmented manner. After that, by leveraging the sampling protocol and key-value correlation perturbation, we demonstrate that CSKV can provide secondary privacy amplification. Detailed theoretical analysis verifies the security and effectiveness of CSKV. In addition, extensive performance evaluations are conducted on synthetic and real-world datasets, and the results indicate that our proposed scheme outperforms existing schemes in terms of hit rate and estimation variance.
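As a point of reference for the randomized response building block mentioned above, the following is a minimal sketch of classic binary randomized response under epsilon-LDP. It is not CSKV's key-value mechanism, whose details the abstract does not give; the function names and the use of a single bit per user are assumptions made for illustration.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-LDP randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def perturb_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability p, otherwise report the flipped bit."""
    p = keep_probability(epsilon)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports: list[int], epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1s (requires epsilon > 0)."""
    p = keep_probability(epsilon)
    observed = sum(reports) / len(reports)
    # E[observed] = f * p + (1 - f) * (1 - p); solve this for f.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

The estimate is unbiased but can fall outside [0, 1] for small samples, which is one reason estimation variance is a natural evaluation metric for such schemes.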

Abstract:
Clothing-changing person re-identification (CC-ReID) aims to address cross-camera person identification challenges caused by variations in pedestrian attire, making it a highly valuable research area within computer vision. Due to the high cost of labeling data, unsupervised learning methods have gained significant attention for CC-ReID tasks. However, existing unsupervised methods frequently suffer from high pseudo-label noise and reliance on complex preprocessing (e.g., human parsing) or multi-encoder architectures, resulting in increased computational overhead and deployment difficulties. To tackle these challenges, this paper proposes an end-to-end unsupervised CC-ReID framework named Multi-Scale Adaptive Clustering and Local Consistency Learning (MALC). Utilizing a single CLIP Vision-Encoder as its backbone, MALC discards such intricate procedures. Its core innovations include a multi-scale adaptive density clustering (MS-ADC) strategy to improve pseudo-label quality, and a local consistency learning (LCL) approach that imposes constraints on local region features to enhance robustness against clothing variations. Through the joint optimization of global and local losses, the model learns a highly discriminative and robust feature representation. Experimental results demonstrate that MALC, employing only RGB images, substantially outperforms comparable unsupervised approaches that rely on additional parsing information, showcasing notable advantages in identification accuracy and ease of deployment. The code will be made available at: https://github.com/ykding666/MALC

Abstract:
The widespread application of AIGC content has brought not only unprecedented opportunities but also potential security concerns, e.g., audio-visual deepfakes. Therefore, it is of great importance to develop an effective and generalizable method for multi-modal deepfake detection. Typically, audio-visual correlation learning can expose subtle cross-modal inconsistencies, e.g., audio-visual misalignment, which serve as crucial clues in deepfake detection. In this paper, we reformulate the correlation learning with variational Bayesian estimation, where audio-visual correlation is approximated as a Gaussian distributed latent variable, and thus develop a novel framework for deepfake detection, i.e., Forgery-aware Audio-Visual Adaptation with Variational Bayes (FoVB). Specifically, given the prior knowledge of pre-trained backbones, we adopt two core designs to estimate audio-visual correlations effectively. First, we exploit various difference convolutions and a high-pass filter to discern local and global forgery traces from both modalities. Second, with the extracted forgery-aware features, we estimate the latent Gaussian variable of audio-visual correlation via variational Bayes. Then, we factorize the variable into modality-specific and correlation-specific ones with an orthogonality constraint, allowing them to better learn intra-modal and cross-modal forgery traces with less entanglement. Extensive experiments demonstrate that our FoVB outperforms other state-of-the-art methods on various benchmarks.

Abstract:
Secure multi-party computation (MPC) over $\mathbb{Z}_{2^k}$ offers better efficiency than computation over fields, and MPC with malicious security has more practical applications. Achieving malicious security with a dishonest majority over rings remains challenging. The most popular approach is $\text{SPDZ}_{2^k}$; however, it is a specific protocol and does not support transforming arbitrary existing semi-honest MPC protocols into maliciously secure ones. Zero-knowledge proof (ZKP) based compilers satisfy this requirement. Existing state-of-the-art protocols have an online communication overhead that is logarithmic in the circuit size $|C|$, and their direct application to rings is non-trivial as they were originally designed for finite fields. In this work, we revisit the question of the communication overhead needed to achieve malicious security. We bridge the gap between malicious security with abort and semi-honest security by constructing a “GMW-style” verification protocol that achieves malicious security in a dishonest majority setting. It incurs a constant online communication overhead by enhancing the machinery of zero-knowledge fully linear interactive oracle proofs (zk-FLIOP). We also extend zk-FLIOP to work over any ring by invoking Reverse Multiplication Friendly Embeddings (RMFEs). Our result shows that the online communication complexity of verification depends only on the security parameter, the number of parties, and the ring size. Furthermore, for small-scale circuits over $\mathbb{Z}_2$, we design a distributed lookup table argument whose total communication complexity and computational cost are independent of the circuit size and depend only on the input wires.
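To make the ring setting concrete, here is a small sketch of additive secret sharing over the ring of integers modulo 2^k, the kind of semi-honest building block that a compiler would then harden. It does not show the verification protocol, zk-FLIOP, or RMFEs described above; the choice K = 64 and all function names are assumptions for illustration only.

```python
import secrets

K = 64                 # ring Z_{2^K}; K = 64 is an illustrative choice
MODULUS = 1 << K


def share(value: int, n_parties: int) -> list[int]:
    """Split value into n additive shares that sum to value modulo 2^K."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine all shares; any strict subset reveals nothing about the value."""
    return sum(shares) % MODULUS


def add_shared(a: list[int], b: list[int]) -> list[int]:
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]
```

Arithmetic modulo 2^K maps directly onto native machine words, which is the usual source of the efficiency advantage over prime fields; the cost is that not every nonzero element is invertible, which is what makes field-based proof techniques hard to reuse directly.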

Abstract:
This paper addresses the spoofing detection issue for the Global Positioning System (GPS) based on radio frequency fingerprinting (RFF). We first introduce a new RFF feature in the post-despreading domain of GPS signal processing to better capture the hardware characteristics of GPS satellites. We model the new RFF feature as grayscale constellation images (GCIs) and provide in-depth analysis to show the hardware characteristics that can be captured by GCIs. Using this feature, we develop a deep-learning based spoofing detection framework named GCI-GANomaly, which applies a generative adversarial network (GAN) with a cosine latent anomaly scoring strategy for robust detection. We evaluate the detection accuracy and false positive rate (FPR) of the proposed method based on the open-source Texas Spoofing Test Battery (TEXBAT) dataset. The results showed that GCI-GANomaly improves the detection accuracy of traditional signal quality monitoring (SQM)-based methods by up to 30.8% and reduces the average FPR of existing RFF-based methods by 4.2% with much less training data while achieving slightly better average detection accuracy. We further evaluate the robustness (against spoofing power and time) of GCI-GANomaly based on a specific GPS signal dataset that we collected from the real-world constellation. The results showed that GCI-GANomaly achieves robust detection performance under varying spoofing power and shows acceptable stability as time elapses.

Abstract:
Current deep learning (DL)-based palmprint verification models rely on centralized training with large datasets, which raises significant privacy concerns due to the sensitive and immutable nature of biometric data. Federated learning (FL), a privacy-preserving distributed learning paradigm, offers a compelling alternative by enabling collaborative model training without the need for data sharing. However, FL-based palmprint verification faces critical challenges, including data heterogeneity from diverse identities and the absence of standardized evaluation benchmarks. This paper addresses these gaps by establishing a comprehensive benchmark for FL-based palmprint verification, which explicitly defines and evaluates two practical scenarios: closed-set and open-set verification. We propose FedPalm, a unified FL framework that balances local adaptability with global generalization. Each client trains a personalized textural expert tailored to local data and collaboratively contributes to a shared global textural expert for extracting generalized features. To further enhance verification performance, we introduce a Textural Expert Interaction Module that dynamically routes textural features among experts to generate refined side textural features. Learnable parameters are employed to model relationships between original and side features, fostering cross-texture-expert interaction and improving feature discrimination. Extensive experiments validate the effectiveness of FedPalm, demonstrating robust performance across both scenarios and providing a promising foundation for advancing FL-based palmprint verification research. The related code is publicly available at https://github.com/Zi-YuanYang/FedPalm

Affiliations: Wireless Connectivity and Sensing Group, Barkhausen Institut, Dresden, Germany; Équipes Traitement de l'Information et Systèmes (ETIS), UMR , École Nationale Supérieure de l'Électronique et de ses Applications (ENSEA), Centre National de la Recherche Scientifique (CNRS), CY Cergy Paris University, Cergy, France; Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy; Technische Universität Dresden, the BMFTR Transfer Hub G-life, the Cluster of Excellence “Centre for Tactile Internet with Human-in-the-Loop (CeTI),” Dresden, Germany

Abstract:
In this paper, we investigate the pertinence of the angle of arrival (AoA) as a feature for robust physical layer authentication (PLA). While most of the existing approaches to PLA focus on amplitude-dependent features of the physical layer of communication channels, such as channel frequency response, channel impulse response, or received signal strength, the use of AoA in this domain has not yet been studied in depth, particularly regarding the ability to thwart spoofing (impersonation) attacks. In this work, we demonstrate that an impersonation attack targeting AoA-based PLA is only feasible under strict conditions on the attacker’s location, which highlights the AoA’s role as a strong feature for unspoofable PLA, especially when 2D AoA is employed. We extend previous works considering a single-antenna attacker to the case of a multiple-antenna attacker, and we develop a theoretical characterization of the conditions under which a successful impersonation attack can be mounted. Furthermore, we have performed extensive simulations in support of theoretical analyses, to validate the robustness of AoA-based PLA.

Abstract:
Diffusion models have emerged as state-of-the-art generative frameworks, excelling in producing high-quality multi-modal samples. However, recent studies have revealed their vulnerability to backdoor attacks, where backdoored models generate specific, undesirable outputs called backdoor targets (e.g., harmful images) when a pre-defined trigger is embedded in their inputs. In this paper, we propose PureDiffusion, a dual-purpose framework that simultaneously serves two contrasting roles: backdoor defense and backdoor attack amplification. For defense, we introduce two novel loss functions to invert backdoor triggers embedded in diffusion models. The first leverages trigger-induced distribution shifts across multiple timesteps of the diffusion process, while the second exploits the denoising consistency effect when a backdoor is activated. Once an accurate trigger inversion is achieved, we develop a backdoor detection method that analyzes both the inverted trigger and the generated backdoor targets to identify backdoor attacks. For attack amplification, taking the role of an attacker, we describe how our trigger inversion algorithm can be used to reinforce the original trigger embedded in the backdoored diffusion model. This significantly boosts attack performance while reducing the required backdoor training time. Experimental results demonstrate that PureDiffusion achieves near-perfect detection accuracy, outperforming existing defenses by a large margin, particularly against complex trigger patterns. Additionally, in attacking scenarios, our attack amplification approach elevates the attack success rate (ASR) of existing backdoor attacks to nearly 100% while reducing training time by up to 20×.

Abstract:
Graph Neural Networks (GNNs) have achieved remarkable success in modeling structured data. Recent studies, however, reveal that they are highly vulnerable to backdoor attacks, which can implant triggers into training data to mislead predictions on nodes injected with triggers while maintaining accuracy on clean inputs. Despite recent advances, existing graph backdoor attacks often rely on explicit training interventions and substantial trigger injection while focusing solely on single-node misclassification, which limits their practicality in real-world deployments. To address these limitations, we propose a clean-label graph backdoor attack that induces one-hop neighborhood misclassification under a minimal trigger injection budget. Without altering target nodes’ features or labels, our method attaches a single trigger node to a target node, thereby misclassifying both the target and its immediate neighbors as the target class. To maximize effectiveness while preserving stealthiness, we propose a poisoned node selection strategy guided by semantic consistency and structural activeness, and design a conditional diffusion-based trigger generator optimized with multiple auxiliary objectives. Extensive experiments on multiple real-world benchmarks and mainstream GNN architectures show that our approach achieves over 95% attack success rate on both target nodes and their neighbors in most settings, including under state-of-the-art defenses. These findings underscore the urgent need for more robust graph learning systems and reveal novel attack surfaces in graph security.

Abstract:
Ultra-Wideband (UWB) technology has recently emerged as a transformative enabler of high-precision positioning systems. Despite its growing adoption across diverse applications, prior studies have claimed several successful distance deception attacks against UWB. To address heightened security concerns, companies like Apple have introduced a ranging-awareness defense mechanism into the new versions of their UWB interaction frameworks, which has proven effective against most known attacks. In this paper, we focus on the design flaws of state-of-the-art UWB interaction frameworks and propose Urey-ML, a novel machine learning-based distance deception attack targeting UWB systems. To the best of our knowledge, this is the first attack capable of circumventing the defense mechanisms implemented in Apple’s UWB Nearby Interaction Framework (ANIF). Specifically, Urey-ML is built upon two critical breakthroughs. First, through network packet analysis, we discover that ANIF leaves a crucial message for key negotiation in an unprotected state. This vulnerability enables Urey-ML to bypass the encryption protection implemented by standard UWB systems. Second, to break the ranging-awareness defense, Urey-ML uses a reinforcement learning-based algorithm to optimize attack parameters. By leveraging this approach, Urey-ML can automatically generate attack signals that mimic the variations typically caused by normal human movement. Our experiments on commercial-off-the-shelf UWB products show that Urey-ML achieves centimeter-level UWB distance deception, with more than 25.79% of signals circumventing the defense check of the victim device, compared with only 0.56% (or outright failure) in prior works.

Abstract:
With the rapid advancement of cloud computing and the exponential growth of data, the demand for secure data querying and sharing has become increasingly prominent. Identity-Based Encryption with Keyword Search (IBEKS) and Hierarchical IBEKS (HIBEKS) address the issue of secure data querying, as they enable resource-constrained clients to effectively search for encrypted data stored in the cloud. However, existing HIBEKS schemes lack flexible data-query mechanisms among users at the same level, are highly vulnerable to attacks launched by quantum computers and to Keyword Guessing Attacks (KGA), and lead to relatively high end-to-end latency. To meet more complex and diverse application requirements and address these vulnerabilities, we introduce a novel primitive called Hierarchical Identity-Based Puncturable Encryption with Keyword Search (HIBPEKS). The decryption keys held by higher-level users are capable of generating decryption keys for lower-level users, thereby enhancing the ability to perform multi-level encrypted data queries within the user group. In addition, to control the query of encrypted data, higher-level users can use specific tags to puncture decryption keys (for lower-level users), so that lower-level users will no longer be able to query the parts of the data associated with the punctured tags. Technically, we improve on previous lattice-based IBEKS schemes and implement an efficient and flexible data query mechanism in a hierarchical setting by exploiting Puncturable Encryption (PE) techniques. Moreover, we formalize the security model of HIBPEKS and prove its security in the random oracle model. Finally, we experimentally evaluate HIBPEKS and show that it is computationally efficient and practical.

Abstract:
This paper studies anti-jamming for Low Earth Orbit (LEO) satellite constellations. First, the anti-jamming problem is modeled as a Local Interaction Markov Game (LIMG) and proven to be an Exact Potential Game (EPG) with at least one pure-strategy Nash Equilibrium (NE). Second, based on the theoretical analysis and the “offline training and online execution” architecture, a Distributed Multi-Agent Deep Reinforcement Learning-based anti-jamming (DMDRLA) scheme is proposed, and its convergence and asymptotic optimality are theoretically analyzed. Finally, simulations validate that the proposed DMDRLA scheme can effectively balance the training costs and performance optimization of the anti-jamming model, making it suitable for anti-jamming in resource-constrained LEO satellite networks.

Abstract:
Gait recognition offers non-contact, long-distance identification but struggles with robustness against covariates like clothing variations, carrying conditions, and viewpoint changes. Existing methods predominantly rely on single modalities (e.g., silhouettes or skeletons) or employ shallow multimodal fusion, such as simple concatenation, which treats modalities as independent and static and fails to exploit their complementary strengths: shape cues from silhouettes and structural kinematics from skeletons. To address these limitations, we introduce the Synergistic co-evolving representations (See) principle, enabling modalities to iteratively interact, guide, and refine each other across semantic hierarchies, fostering a unified, robust identity representation resilient to complex environments. This is realized through SeeGait, a novel multimodal framework featuring hierarchical multi-stage fusion. At its core, the Bidirectional Hierarchical Cross-Attention Synergy Module (BiHCASM) employs adaptive cross-modal attention to dynamically align and reweight features bidirectionally, allowing structural insights to enhance appearance focus and vice versa. Complementing this, the Hierarchical Spatiotemporal Transformer Encoder (HSTE) captures long-range skeleton dynamics, overcoming GCN limitations, while the Hierarchical Convolutional Silhouette Encoder (HCSE) extracts multi-scale silhouette pyramids for rich shape priors. Finally, a Holistic Feature Aggregation (HFA) strategy consolidates features from all stages for deep supervision, ensuring comprehensive optimization. By promoting mutual refinement, SeeGait mitigates covariate disruptions through enhanced complementarity, yielding superior discriminability. Extensive experiments show state-of-the-art performance, with 97.1% average Rank-1 accuracy on CASIA-B, and top results on CCPG and SUSTech1K. Code is available at https://github.com/benxianyeteam/SeeGait.

Abstract:
This paper focuses on the challenge-response physical-layer authentication (CR-PLA) scheme where a reflecting intelligent surface (RIS) is under the control of a receiving base station (BS) (Bob) who aims at checking if received messages come from a legitimate user equipment (UE) Alice or from an impersonating device (Trudy). To this end, Bob sets a random configuration of the RIS which remains secret to the attacker, and verifies that the channel estimated on the received message corresponds to the set configuration. We design the probability distribution of RIS configurations chosen by the verifier to maximize average capacity while satisfying an upper bound on missed detection (MD) probability for a given false alarm (FA) probability. The balance of communication and security metrics demonstrated by the numerical results shows the effectiveness and potential of the CR-PLA scheme.

Abstract:
Malicious traffic detection often requires large, labeled datasets, which are challenging to obtain due to privacy concerns, labeling costs, and evolving threat patterns. Although recent self-supervised pretraining methods address this issue, they rely on complex transformer-based architectures that are computationally expensive and have high inference times, making them unsuitable for real-time use. In addition, most existing approaches process packets or flows independently, and often rely on per-packet dataset splits that introduce implicit flow-level data leakage, thereby limiting their ability to capture meaningful semantic and behavioral relationships across flows for detecting stealthy encrypted threats. To address these issues, we propose LitCVit, a lightweight self-supervised contrastive Vision Transformer-based framework that captures cross-flow semantic and behavioral patterns to generate robust latent representations of encrypted traffic. Without relying on decryption or manually engineered features, our method enables efficient detection of encrypted malicious flows with low inference time. Extensive evaluations on benchmark datasets demonstrate that the proposed framework achieves an average detection accuracy of 98.10% and F1-score of 98.08%. Compared to the best state-of-the-art model, LitCVit achieves an average improvement of 2.49% in F1-score, 2.12% in precision, and 2.50% in recall, highlighting its superior detection capability in encrypted traffic scenarios. Additionally, LitCVit achieves an 8.7× reduction in inference time compared to the best existing self-supervised approach, making it highly suitable for deployment on resource-constrained devices.

Abstract:
Recently, a growing number of backdoor attacks have been implemented against vertical federated learning (VFL). To carry out such attacks, it is possible for an attacker to manipulate its local data and model, which are not accessible to the defender. This asymmetry complicates the design of defense mechanisms, resulting in a lack of effective defenses against backdoor attacks in VFL. In this paper, we propose a novel backdoor defense mechanism tailored for VFL. Specifically, we observe that the malicious embeddings provided by an attacker are often inconsistent with the embeddings provided by honest VFL participants and thus are less predictable from the latter. On the basis of this observation, we introduce a latent masked autoencoder (LMAE) to assess the semantic consistency of the embeddings provided by different VFL participants. On the basis of the LMAE, we further develop an algorithm to identify attackers and enable backdoor-resistant predictions. To evaluate the proposed defense, we conduct experiments involving four baseline defenses, seven backdoor attacks, and five datasets (CIFAR-10, CINIC-10, Yahoo Answers, BHI, and BM) of different modalities. The results show that our defense is effective at identifying attackers and achieves high prediction accuracy. Furthermore, we demonstrate that the defense is robust against various attacks in a wide range of scenarios.

Affiliations: Department of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, Chongqing, China; Department of Computer and Software Engineering, Huai’an University, Huaian, China; PLA Information Engineering University, Zhengzhou, China; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen University, Guangzhou, China; Shandong Provincial Key Laboratory of Computer Networks, Qilu University of Technology, Jinan, China; Department of Cyber Security, Nankai University, Tianjin, China

Abstract:
Generator-based adversarial attack methods aim to fool deep neural networks (DNNs) by training a generator for crafting adversarial examples (AEs). However, as DNNs evolve from Convolutional Neural Networks (CNNs) to Transformers, existing generator-based methods can hardly achieve satisfactory attack performance against different target model architectures in semi-whitebox attack scenarios. In addition, the generated AEs are susceptible to various distortions (especially JPEG compression with low quality factors), which weaken the attack ability and reduce reliability. To address these issues, we propose a dual-branch guided generative model called Heterogeneous Encoding Network (HENet) to form a robust generator-based adversarial attack framework. Specifically, HENet introduces an Adaptive Feature Fusion Module (AFFM) to resolve the dimension and representativeness mismatches between CNNs and Transformers, which steers the perturbation generation based on a richer latent space and achieves better general attack ability. To further improve the robustness against JPEG compression, we design and integrate a Dynamic Differentiable JPEG Simulator (DDJS), which introduces an adaptive quantization mask to determine the flow of gradient backpropagation at each frequency position. Extensive experiments show that the proposed method achieves a better attack success rate, lower perturbation magnitude, and higher robustness for various target network architectures under compressed, distorted, and lossless scenarios. Our code will be made publicly available.

Abstract:
With the widespread availability of cloud computing and big data, homomorphic encryption has become a crucial technique for protecting data privacy in outsourced computation. However, existing outsourced computation toolkits based on homomorphic encryption either require all participants to share the same key or incur enormous computational and communication overheads. To address these shortcomings, we put forward a toolkit for efficient and secure outsourced computation in multiple-key scenarios (ESCM). ESCM permits the servers to process the most frequently used operations, such as multiplication, division, and sorting, across different encrypted domains. Moreover, to tackle the security concerns that may arise from collusion among some servers, as well as the problem of service disruption due to server outages, we propose the distributed two-trapdoor cryptosystem with threshold decryption, the core cryptographic primitive, which supports (k,n) threshold decryption. Theoretical analysis validates the security of the proposed ESCM and compares its computation and communication complexity with the most advanced existing solutions. Finally, simulation experiments illustrate the practicality and efficiency of ESCM.
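The (k,n) threshold idea can be illustrated with Shamir secret sharing, shown here as a stand-in technique: any k of n servers can recover a key while fewer than k learn nothing, and up to n - k servers may be offline. This is not the distributed two-trapdoor cryptosystem itself, which the abstract does not specify; the field size and function names below are assumptions.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an illustrative choice


def split_secret(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Return n points on a random degree-(k-1) polynomial with f(0) = secret."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation of f(x) mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate f(0) from at least k distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, with k = 3 and n = 5, any three of the five shares returned by `split_secret` reconstruct the secret, so two server outages can be tolerated.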

Abstract:
Secure neural network inference is a privacy-preserving inference method that protects the model parameters and the user’s private input. Previous works have constructed two-party, three-party, and four-party secure inference schemes. However, these schemes tolerate only one corrupted party. Also, the interaction protocol between parties is customized to the number of participants: if the number of participants increases or decreases, the protocol needs to be redesigned. Another problem is that current protocols for non-linear functions still have large computation overhead. In this work, we present TFMD, a general and fast secure neural network inference framework with semi-honest security. TFMD is built on threshold fully homomorphic encryption (FHE) and is suitable for the outsourced computation scenario. Concretely, TFMD designs general secure computation protocols for non-linear functions. Our protocols support an arbitrary number n of participants and tolerate up to n-1 corrupted parties. Further, TFMD constructs a novel secure neural network inference framework. TFMD employs FHE with computation-friendly coefficient encoding to quickly calculate linear functions, and employs our proposed protocol to calculate ReLU. Experiments illustrate that TFMD is both efficient and scalable. Even in the three-party setting, the online phase of our inference is 2.1× faster than CrypTFlow (S&P’20).

Abstract:
With the rapidly growing demand for collaborative data analysis, Jaccard Coefficient (JC) computation over multisets has been widely adopted in data deduplication to enhance large-scale data processing efficiency, but it also raises security issues such as leakage of the input data sets. Thus, Secure Jaccard Coefficient Computation over Multisets (SCJM) schemes have been proposed. However, existing solutions that indirectly compute JC fail to protect the privacy of intersection and union cardinalities, incur high computational overhead, and rely on approximation techniques that cannot support high-precision analysis or simulation-based security proofs. To address these problems, we propose a secure and efficient protocol to accurately compute JC over multisets. Specifically, the protocol computes JC for small-scale data domains using secure oblivious ratio computation, ensuring that under the semi-honest adversarial model, the intersection and union cardinalities remain concealed from all parties, including the decryption key holder, during and after the computation. It also achieves a linear computational cost of $O(e_m-1)$ without accuracy loss, where $e_m-1$ denotes the maximum number of repetitions of an element in the multiset. The protocol can also be extended to compute the cardinalities of the intersection and union. To further enhance efficiency, we introduce a cloud-assisted encryption scheme, which improves computational efficiency by 25.5% to 30.4% compared with the non-cloud-assisted scheme. Additionally, we provide a security proof of the proposed protocol in the ideal-real paradigm. Experimental results show efficiency advantages of our protocol over the state-of-the-art solution.
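For reference, the quantity being protected can be computed in the clear as follows. This is a plaintext sketch only, assuming the common multiset definition in which the intersection takes the minimum multiplicity of each element and the union takes the maximum; it says nothing about how the secure protocol above hides these cardinalities.

```python
from collections import Counter
from fractions import Fraction


def multiset_jaccard(a, b) -> Fraction:
    """Plaintext Jaccard coefficient of two multisets given as iterables."""
    ca, cb = Counter(a), Counter(b)
    intersection = sum((ca & cb).values())  # per-element minimum of counts
    union = sum((ca | cb).values())         # per-element maximum of counts
    if union == 0:
        return Fraction(0)  # convention chosen here for two empty multisets
    return Fraction(intersection, union)
```

Returning an exact `Fraction` mirrors the abstract's emphasis on computing JC without accuracy loss, as opposed to sketch-based approximations.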

Abstract:
Deep learning-based image forgery localization models are increasingly deployed in real-world forensic services, yet their robustness against black-box adversarial manipulation remains insufficiently understood, calling for practical anti-forensics techniques to expose potential security weaknesses. Prior adversarial anti-forensics studies for forgery localization mainly assume white-box access, which limits their applicability to deployed systems where only hard, mask-like outputs are available and queries are tightly constrained. To bridge this gap, we propose AdvFor, a query-efficient black-box attack framework tailored for forgery localization with hard-label, mask-only, spatially dense binary feedback. AdvFor formulates the attacker–model interaction as a finite-horizon Markov Decision Process and learns a transferable attack policy from hard-mask feedback. Once trained, AdvFor can be deployed via fixed-length policy execution with only T=7 queries per image, avoiding per-image boundary refinement or query-dense direction/gradient estimation. The learned policy optimizes a structured objective—progressively suppressing forgery responses in the predicted localization mask so that the masks of forgery images approach an authentic-like (near-zero) output—while maintaining visual fidelity. Extensive experiments on six benchmark datasets and multiple modern forgery localization models demonstrate that AdvFor consistently achieves stronger attack performance than representative baselines under the same perturbation constraints, while operating in an ultra-low-query regime. We further validate AdvFor under common deployment-style defenses, showing its notable effectiveness in realistic settings.

Abstract:
Autonomous aerial vehicles (AAVs) are revolutionizing free-space optical (FSO) communication systems by enabling dynamic, flexible, and high-capacity connectivity in challenging environments. To enable multi-user communication in such environments, we incorporate rate splitting multiple access (RSMA) at the AAV to manage inter-user interference for downlink communication. In this context, Eve/Willie acts as an eavesdropper intercepting the transmission of the FSO signal intended for Bob (B). We consider the combined influence of the generalized Málaga distributed atmospheric turbulence ($\mathcal{M}$-AT), fog, non-zero boresight pointing errors (PEs), and angle-of-arrival (AoA) fluctuations of the AAV on the FSO channel. The covert performance of the considered AAV-assisted FSO communication system is analyzed by evaluating the series-based expressions for the detection error probability (DEP) and covert rate. Furthermore, we leverage the fractional equivocation and partial secrecy regime to analyze the physical layer security (PLS) by computing the series-based expressions for the generalized secrecy outage probability (GSOP), average fractional equivocation (AFE), and average information leakage rate (AILR) to quantify the considered AAV-assisted FSO system’s performance. To improve the robustness of the proposed system, we optimize the rate parameters to minimize the GSOP and maximize the throughput of the confidential message. Through extensive simulations, we evaluate the impact of disparate channel parameters on the considered system’s performance. Additionally, we investigate the impact of the confidential rate on the PLS of the considered system.

Abstract:
Federated Learning enables decentralized model training without exposing raw data, but remains fundamentally vulnerable to poisoning attacks from malicious clients. Existing defenses rely heavily on passive anomaly detection, honest majority assumptions, or unrealistic statistical priors, making them ineffective against adaptive and stealthy adversaries. In this paper, we propose SpecShield, a proactive defense mechanism that actively probes client models through calibrated adversarial perturbations. By leveraging the Fast Gradient Sign Method on the server side, SpecShield elicits dynamic response patterns from each client. These responses are then analyzed in the frequency domain using the Discrete Wavelet Transform. These frequency-domain features uncover distinctive response patterns between benign and malicious clients, enabling robust detection of model poisoning attacks in both non-IID environments and Byzantine majority scenarios. We further derive theoretical upper bounds on perturbation magnitudes to guarantee detection accuracy while preserving benign client performance. Through extensive experiments conducted on real-world datasets under six state-of-the-art poisoning attacks, SpecShield consistently outperforms existing defenses in both detection accuracy and model robustness. Our results demonstrate that active perturbation-induced profiling provides a new dimension for securing federated learning against sophisticated adversarial threats.

Abstract:
Recommendation systems based on graph neural networks (GNNs) have emerged as a promising paradigm due to their ability to capture high-order interactions between users and items. However, in federated scenarios, this advantage is compromised, as each user can access only a one-order subgraph composed of its directly interacted items. To address this issue, most existing solutions introduce a trusted server to assist users in expanding their local subgraphs. However, the server in reality is often untrusted and may deviate from the protocol for its own improper benefit. Furthermore, these solutions primarily focus on the privacy of items while neglecting the privacy of potential relationships between users. To this end, we propose Garland, a GNN-based federated recommendation scheme with malicious security. Garland departs from existing work by ensuring both item and relationship privacy while supporting integrity checks to defend against malicious servers. Specifically, we employ a trending cryptographic primitive of secret-shared shuffle to expand subgraphs in a privacy-preserving and verifiable manner. We also design a pre-shuffle triple-salt encryption mechanism and a post-shuffle user-governed expansion mechanism to reduce communication costs and achieve secure distribution of neighbor information, respectively. Moreover, we develop a secret-shared aggregation mechanism to enable privacy-preserving and verifiable federated training. Theoretical analysis demonstrates the privacy and integrity of Garland. Extensive experimental evaluations on four datasets show that Garland outperforms state-of-the-art solutions.

Abstract:
Backdoor attacks pose a critical threat to the security and reliability of deep neural networks (DNNs), enabling malicious triggers to manipulate model predictions while maintaining normal performance on clean inputs. Addressing this challenge requires robust defense mechanisms that eliminate backdoors without compromising model utility. This paper introduces PVDI (Preserving Vital and Disrupting Irrelevant attentions), a novel blind purification method designed to neutralize backdoors while retaining the primary prediction capabilities of both backdoored and clean models. PVDI leverages vital and irrelevant attentions during fine-tuning: vital attention is preserved to maintain the model’s core functionality, while irrelevant attention is disrupted to neutralize backdoor behavior. Extensive evaluations across diverse datasets and attack scenarios demonstrate PVDI’s superior purification performance, achieving significant reductions in attack success rates while preserving the utility of backdoored models and minimizing adverse impact on clean models. PVDI outperforms existing state-of-the-art defenses and sets a new benchmark for backdoor defense. This work represents a significant step forward in combating backdoor attacks.

Abstract:
Electronic Health Record (EHR) has improved medical data management efficiency through cloud-based storage and sharing. However, storing sensitive EHR data in third-party clouds introduces serious security and privacy risks. Registered attribute-based encryption (RABE), building on the advantages of traditional attribute-based encryption, fundamentally addresses the key escrow problem by radically changing the trust model. As an emerging cryptographic primitive, RABE provides a promising foundation for key-escrow-free secure data sharing. Nevertheless, the lack of dynamic permission updates and high decryption overhead hinder the practical adoption of existing RABE schemes. Motivated by these challenges, we first propose an RABE-based access control scheme for medical cloud environments. The scheme supports dynamic ciphertext updates to accommodate permission changes without frequent re-encryption and enables outsourced decryption to reduce user-side computational overhead. On this basis, we further propose a scheme with consistency verification to ensure the correctness of updated ciphertexts. Security analysis and experimental results demonstrate that the proposed schemes effectively preserve data privacy, enhance access control flexibility, and show practical potential for secure data sharing in real-world medical cloud environments.

Abstract:
Deep hashing models have been widely adopted to tackle the challenges of large-scale image retrieval. However, these approaches face serious security risks due to their vulnerability to adversarial examples. Despite the increasing exploration of targeted attacks on deep hashing models, existing approaches still suffer from a lack of multimodal guidance, reliance on labeling information, and dependence on pixel-level operations for attacks. To address these limitations, we propose DiffHash, a novel diffusion-based targeted attack for deep hashing. Unlike traditional pixel-based attacks that directly modify specific pixels and lack multimodal guidance, our approach focuses on optimizing the latent representations of images, guided by text information generated by a Large Language Model (LLM) for the target image. Furthermore, we design a multi-space hash alignment network to align the high-dimensional image space and text space to the low-dimensional binary hash space. During reconstruction, we also incorporate text-guided attention mechanisms to refine adversarial examples, ensuring that they align with the target semantics while maintaining visual plausibility. Extensive experiments demonstrate that our method outperforms state-of-the-art (SOTA) targeted attack methods, achieving better black-box transferability and greater stability across datasets. The code is available at https://github.com/Raineasy/DiffHash

Abstract:
Reliable encrypted traffic classification is crucial for fine-grained and efficient network security management, enabling accurate user behavior recognition and cybercrime forensics. While AI-based methods can automatically extract subtle features from traffic data, existing approaches often fail to effectively capture and integrate features across different levels of traffic granularity, namely the byte, packet and flow levels. Current graph-based methods heavily rely on manual feature engineering to construct global IP-based graphs, overlooking critical packet-level temporal features and byte-level raw information. Focusing on only one or two levels of traffic granularity is unreliable and insufficient, ultimately compromising model accuracy and robustness. To address these limitations, we propose BPF-DAG, a byte-packet-flow feature fusion framework based on dynamic attributed graphs, for reliable encrypted traffic classification. To the best of our knowledge, this is the first method that integrates temporal packet relations into flow interaction patterns while directly leveraging raw byte-level data. Specifically, we introduce a multi-granularity feature fusion strategy that dynamically updates an IP-based graph by iteratively assigning edge attributes derived from evolving flow representations. During the joint training of the Transformer and the graph neural network, temporal representations are learned from raw packet sequences and reflected in edge attributes dynamically for further message aggregation. Experiments on the ISCX VPN-nonVPN, Tor-nonTor, MIRAGE-2019 and MIRAGE-2024 datasets show that BPF-DAG outperforms recent state-of-the-art methods in terms of classification performance.

Abstract:
To defend against adversarial structural attacks on graphs, we analyze attacks through the lens of mutual information and discover the “pairwise effect”. This effect reveals that structural attacks effectively degrade the performance of victim GNNs when these GNNs receive the modified structure paired with the given node attributes as training input. Therefore, we propose a novel defense strategy that renders structural attacks ineffective by disrupting the pairing of modified structures and node attributes during the training of victim GNNs, which we call “disrupting the pairwise effect”. To implement this idea, we propose two simple yet effective training strategies: Structural Fine-Tuning (SF) and Progressive Structural Training (PST), which disrupt the pairwise effect through node attributes pre-training followed by structure fine-tuning and progressive structure training, respectively. Compared to existing robust GNNs, our strategies avoid time-consuming techniques, thereby improving the robustness of GNNs while enhancing training speed. Additionally, these strategies can be easily applied to a wide range of commonly used GNNs, including robust GNN variants, making them highly adaptable to different models and applications. We provide theoretical analysis of the proposed training strategies and conduct extensive experiments on various datasets to demonstrate their effectiveness. Datasets and codes of this paper are available at https://github.com/Xing-Ai1003/Revisiting-Adversarial-Robustness-of-GNNs

Abstract:
With the increasing popularity of voice-centric applications, acoustic eavesdropping attacks pose a significant threat to user privacy. Although smartphones require explicit user permission to access the microphone, such attacks can bypass this restriction by exploiting power consumption data through compromised power supplies, such as USB adapters, public charging stations, and power banks. However, previous attempts can only recognize a limited set of hotwords or digits. To address this limitation, we introduce PowerEar, an acoustic eavesdropping attack that leverages the power side channel to reconstruct any audio reproduced by the built-in loudspeaker of a mobile device with an unconstrained vocabulary. Our approach relies on a combination of signal processing and generative techniques to learn the mapping between power consumption and audio playback, enabling the reconstruction of such audio through spectrogram enhancement. To validate the effectiveness of PowerEar attack, we carry out a comprehensive set of experiments using audio samples from various public personalities. Our results obtained through objective and subjective evaluations clearly demonstrate that PowerEar can successfully recover user speeches from power consumption data in comprehensive realistic settings, including speech utterances of individuals and different devices, mobile operating systems, activities, charging technology, battery, and volume levels.

Affiliations: School of Intelligence Science and Technology, Nanjing University, Suzhou, China; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, China; State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Center for Research on Intelligent Perception and Computing (CRIPAC), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China

Abstract:
Adversarial Training (AT) has been shown to significantly enhance adversarial robustness via a min-max optimization approach. However, its effectiveness in video recognition tasks is hampered by two main challenges. First, fast adversarial training for video models remains largely unexplored, which severely impedes its practical applications. Specifically, most video adversarial training methods are computationally costly, with long training times and high expenses. Second, existing methods struggle with the trade-off between clean accuracy and adversarial robustness. To address these challenges, we introduce Video Fast Adversarial Training with Weak-to-Strong consistency (VFAT-WS), the first fast adversarial training method for video data. Specifically, VFAT-WS incorporates the following key designs: First, it integrates a straightforward yet effective temporal frequency augmentation (TF-AUG), and its spatial-temporal enhanced form STF-AUG, along with Fast Gradient Sign Method (FGSM) to boost training efficiency and robustness. Second, it devises a weak-to-strong spatial-temporal consistency regularization, which seamlessly integrates the simple TF-AUG and the more complex STF-AUG. Leveraging the consistency regularization, it steers the learning process from simple to complex augmentations. Both of them work together to achieve a better trade-off between clean accuracy and robustness. Extensive experiments on UCF-101 and HMDB-51 with both CNN and Transformer-based models demonstrate that VFAT-WS achieves great improvements in adversarial robustness and corruption robustness, while accelerating training by nearly 490%.

Affiliations: School of Cyber Engineering, Xidian University, Xi’an, China; Guangzhou Institute of Technology, Xidian University, Guangzhou, China; School of Cyberspace Security, Beihang University, Beijing, China; State Key Laboratory of Integrated Services Networks, School of Cyber Engineering, and the Engineering Research Center of Big Data Security, Ministry of Education, Xidian University, Xi’an, China; College of Cyber Security, Jinan University, Guangzhou, China; School of Information Systems, Singapore Management University, Bras Basah, Singapore

Abstract:
With the widespread use of encrypted spatial data, many range query schemes have emerged to address potential security risks caused by access pattern leakage. However, most existing schemes rely on a dual-server model to hide access patterns and often involve complex spatial relation judgments during range comparisons, leading to low query efficiency. To address these issues, we propose a novel Fast and Access Hidden Range Query (FAHRQ) scheme. First, we introduce an efficient range membership verification technique based on Bloom filters and a Lagrange interpolation function, combine it with homomorphic encryption to ensure the confidentiality of spatial data and the computational flexibility of related operations, and hide the access pattern under a single server. Then, we construct an index using an R-tree and employ Bloom filters and prefix 0-1 encoding to accelerate the minimum bounding rectangle intersection judgment, enabling secure and efficient range queries over encrypted spatial data while maintaining retrieval accuracy. Finally, we give a formal security analysis to show that our scheme hides access patterns while protecting data security, and conduct extensive experiments to demonstrate that our scheme improves query efficiency by 5-7× compared to existing schemes.
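
To make the idea of prefix 0-1 encoding concrete, the following sketch shows one well-known variant, the 0-encoding and 1-encoding of fixed-width integers, in which `x > y` holds exactly when the 1-encoding of `x` and the 0-encoding of `y` share an element. This is an assumption about what the abstract refers to: the abstract does not define its encoding, and the plain Python sets used here stand in for the Bloom filters and ciphertexts of the actual scheme. All function names are hypothetical.

```python
def to_bits(value, width):
    """Fixed-width binary string of a non-negative integer."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given bit width")
    return format(value, f"0{width}b")


def zero_encoding(value, width):
    """Prefixes ending at each 0 bit, with that bit flipped to 1."""
    bits = to_bits(value, width)
    return {bits[:i] + "1" for i, bit in enumerate(bits) if bit == "0"}


def one_encoding(value, width):
    """Prefixes ending at each 1 bit."""
    bits = to_bits(value, width)
    return {bits[: i + 1] for i, bit in enumerate(bits) if bit == "1"}


def greater_than(x, y, width):
    """x > y iff the 1-encoding of x intersects the 0-encoding of y."""
    return not one_encoding(x, width).isdisjoint(zero_encoding(y, width))


def in_range(value, low, high, width):
    """low <= value <= high, expressed with set-intersection tests only."""
    return not greater_than(low, value, width) and not greater_than(value, high, width)


if __name__ == "__main__":
    print(greater_than(6, 5, 3))   # True: both encodings contain "11"
    print(in_range(5, 3, 6, 3))    # True
```

Because every comparison reduces to a set-membership test, the encodings can be stored in membership structures such as Bloom filters, which is presumably why the two techniques are paired for the bounding-rectangle intersection test.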

Abstract:
Non-contact palm-vein recognition has been widely adopted in security-critical applications owing to its contact-free acquisition paradigm and exceptional discriminative power. However, these systems remain susceptible to a spectrum of presentation attacks (PAs), creating significant security risks that require urgent mitigation. Progress on palm-vein anti-spoofing is currently impeded by three fundamental gaps: 1) the absence of an open-source, end-to-end pipeline for preprocessing liveness-detection data; 2) a lack of publicly available datasets specifically tailored to anti-spoofing evaluation; and 3) the unavailability of benchmark studies employing standardized protocols and reference implementations. In light of these key issues, we make the following four contributions. First, we propose a new open-source preprocessing pipeline that can significantly improve model performance, reducing errors by up to 53.3%. Second, we introduce PVASD, the largest known dataset of its kind, comprising 1,187,519 images from 5,515 subjects: 880,241 live palm-vein images and 307,278 spoof attack images. The spoof images cover 16 attack types, both 2D (printed materials) and 3D (gloves and prosthesis models), captured in a variety of environments using off-the-shelf commercial-grade sensors at five resolutions. Third, we create comprehensive benchmarks by evaluating three types of representative methods, namely classical image classification models, anti-spoofing methods adapted from the face domain, and anomaly-detection-based approaches. Our experimental results reveal characteristics unique to palm-vein spoof attacks, which we hope will provide valuable guidance for further investigation. Finally, we expand PVASD with 20,000 spoof samples generated by artificial intelligence and evaluate the vulnerability of existing models to deepfake attacks.
We will release our preprocessing pipeline, dataset, and benchmark codes at https://github.com/valhongli/PVASD to advance future reproducible studies and accelerate palm-vein anti-spoofing algorithmic research.

Abstract:
Machine learning (ML) is highly effective for accurate encrypted malicious traffic identification by using high-quality training data. In fact, obtaining such data is costly and challenging. As a result, many ML-based models are inevitably trained on low-quality data and perform poorly. To enhance performance, some methods utilize various sample selection techniques to choose confident samples for model training. However, they often rely on a single metric for this selection, which restricts their adaptability across diverse datasets and noise conditions. In this paper, we propose a robust framework BAPTISM for identifying encrypted malicious traffic with low-quality training data. Particularly, BAPTISM selects a suitable base model for each task, and trains it with early stopping to generate traffic representation before overfitting occurs. Then, we devise an adaptive metric selection strategy to select confident samples. By employing two metrics (JSD and CSD) to assess the characteristic of traffic representation from distinct perspectives, we find the more proper metric for each class and apply it for confident sample selection. According to the confident samples and selected metric for each class, we develop a label correction tactic which adapts to class nature to improve the quality of training data. Finally, we employ parallel training strategy to train the base model with the corrected data, further mitigating the impact of low-quality data. We conduct experiments across three real-world malicious traffic datasets with various noise settings. The results demonstrate that BAPTISM is compatible with different base models and outperforms across noise ratios ranging from 20% to 90%. Meanwhile, BAPTISM consistently selects the confident samples with the highest purity and volume under each setting.

Abstract:
Malicious Python packages make software supply chains vulnerable by exploiting trust in open-source repositories like Python Package Index (PyPI). Lack of real-time behavioral monitoring makes metadata inspection and static code analysis inadequate against advanced attack strategies such as typosquatting, covert remote access activation, and dynamic payload generation. To address these challenges, we introduce DySec, a machine learning (ML)-based dynamic analysis framework for PyPI that uses eBPF kernel and user-level probes to monitor behaviors during package installation. By capturing 36 real-time features–including system calls, network traffic, resource usage, directory access, and installation patterns–DySec detects threats like typosquatting, covert remote access activation, dynamic payload generation, and multiphase attack malware. We developed a comprehensive dataset of 14,271 Python packages, including 7,127 malicious sample traces, by executing them in a controlled isolated environment. Experimental results demonstrate that DySec achieves 96% detection accuracy with an ML inference latency of <0.5s after dynamic feature extraction, reducing false negatives by 78.65% compared to static analysis and 82.24% compared to metadata analysis. During the evaluation, DySec flagged eleven packages that PyPI classified as benign. A manual analysis, including installation behavior inspection, confirmed six of them as malicious. These findings were reported to PyPI maintainers, resulting in the removal of four packages. DySec bridges the gap between reactive traditional methods and proactive, scalable threat mitigation in open-source ecosystems by uniquely detecting malicious install-time behaviors.

Abstract:
Despite its success in many applications, federated learning is increasingly vulnerable to sophisticated poisoning attacks. Existing defenses, particularly Byzantine Robust Aggregation Rules (BRARs), offer some protection but rely on strong assumptions or challenging technical prerequisites. To address these shortcomings, we propose an ensemble defense with provably convergent aggregation (EndPCA). By using the entropy weight method to consolidate scores from multiple BRARs into an ensemble trust score, it effectively integrates heterogeneous weak BRARs to resist a wide range of poisoning attacks under practical assumptions. We formally prove that EndPCA can provide theoretical guarantees of convergence with bounded error. Our empirical evaluations show that EndPCA consistently outperforms existing BRARs, demonstrating its effectiveness across various scenarios.
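
The score-consolidation step can be illustrated with the textbook entropy weight method. This is a minimal sketch under stated assumptions, not EndPCA itself: the abstract does not specify how per-rule scores are produced or normalized, so the input matrix, the function names `entropy_weights` and `ensemble_trust`, and the handling of degenerate columns are all hypothetical. Each row is a client, each column is the non-negative score one aggregation rule assigns to that client, and rules whose scores discriminate more between clients (lower entropy) receive larger weights.

```python
import math


def entropy_weights(scores):
    """Entropy-weight-method weights for the columns of a score matrix.

    scores[i][j] is the non-negative score of client i under rule j.
    """
    n_clients = len(scores)
    n_rules = len(scores[0])
    if n_clients < 2:
        raise ValueError("need at least two clients to compute entropy")
    divergences = []
    for j in range(n_rules):
        column = [row[j] for row in scores]
        total = sum(column)
        if total == 0:
            divergences.append(0.0)  # an all-zero column carries no information
            continue
        proportions = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in proportions if p > 0)
        entropy /= math.log(n_clients)  # normalize into [0, 1]
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        return [1.0 / n_rules] * n_rules  # no rule discriminates: weight uniformly
    return [d / total_divergence for d in divergences]


def ensemble_trust(scores):
    """Weighted sum of per-rule scores for each client."""
    weights = entropy_weights(scores)
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]


if __name__ == "__main__":
    # Rule 0 scores every client identically; rule 1 separates them.
    matrix = [[0.5, 0.9], [0.5, 0.8], [0.5, 0.1]]
    print(entropy_weights(matrix))
    print(ensemble_trust(matrix))
```

In this toy matrix the first rule gives every client the same score and so receives essentially zero weight, while the second rule dominates the ensemble score.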

Abstract:
Specific emitter identification (SEI) is a security authentication technology that utilizes radio frequency fingerprinting (RFF) features. However, current mainstream RFF feature extraction methods based on neural networks (NNs) generally suffer from poor interpretability and difficulties in architecture optimization. To address these issues, we propose a novel convolution-enhanced swin vision transformer (CoSwinVIT) that combines the filter characteristics of NN architectures to achieve a uniform spectrum response. Specifically, we treat the NNs used in SEI tasks as filters and investigate their filter characteristics by applying Fourier transforms to hidden vectors. Through spectral analysis, we find that different NN architectures exhibit significantly different responses to the signal spectrum. This affects the model’s sensitivity to specific frequency bands of the signal, thereby influencing its accuracy (Acc). Subsequently, by integrating the filtering properties of the swin vision transformer (SwinVIT) and the convolutional neural network (CNN), we achieve a uniform spectral response design. Finally, to evaluate the performance of the CoSwinVIT architecture, we design both a supervised learning algorithm and a contrastive learning-based self-supervised algorithm. Experimental results on real-world automatic dependent surveillance-broadcast (ADS-B) and wireless fidelity (WiFi) datasets indicate that CoSwinVIT provides a more uniform spectrum response. Under supervised learning, the proposed CoSwinVIT obtains accuracies of 98.6% and 99.8% on the ADS-B and WiFi datasets, respectively. Under self-supervised learning with 5% and 10% labeled data, accuracies of 73.6% and 86.3% are achieved on the ADS-B dataset, and accuracies of 81.6% and 91.5% are achieved on the WiFi dataset. These results surpass the state-of-the-art (SOTA) methods used in SEI tasks.

Abstract:
Recently, backdoor attacks, which aim to implant malicious logic into deep learning models (DLMs), have attracted extensive research attention. Among them, the non-poisoning-based backdoor attack shows considerable development prospects owing to the threat it poses to DLM-based artificial intelligence applications in cyberspace. However, previous non-poisoning-based backdoor attacks on DLMs are limited to impractical attack forms, resulting in weaknesses in both attack complexity and attack adaptability. To tackle these issues, this paper proposes a novel backdoor attack framework, namely shell code injection (SCI), to perform backdoor attacks against DLMs with lower complexity and higher adaptability. Specifically, to reduce attack complexity, we design a logic-driven stealthy backdoor shell inspired by biological behavior in nature, e.g., the camouflage and attack strategy of crabs. By introducing trigger consistency verification and short-circuit code packaging strategies, SCI misleads the victim models into outputting wrong predictions according to the preset poisonous decision logic, without any training. To enhance attack adaptability, we design an LLM-assisted adaptive attack target code generation mechanism that consists of a model concept detection module and an attack target adjusting module. Since the attack goals can be generated dynamically according to the detected victim model information and the attacker's preset instructions, SCI achieves more flexible attack performance. Extensive experiments demonstrate that the proposed backdoor attack framework achieves strong attack capability (almost 100% ASR) under various settings. Additionally, we provide a case study on combining a cyber attack with SCI, which also suggests room for new types of backdoor attacks.
The code is released at https://github.com/WDQhello/Shell_attack/

Abstract:
The rapid advancement of Generative Adversarial Networks (GANs) and diffusion models has enabled the creation of highly realistic synthetic images, presenting significant societal risks, such as misinformation and deception. As a result, detecting AI-generated images has emerged as a critical challenge. Existing research emphasizes extracting fine-grained features to enhance detector generalization, yet they often lack consideration for the importance and interdependencies of internal elements within local regions and are limited to a single frequency domain, hindering the capture of general forgery traces. To overcome the aforementioned limitations, we first utilize a sliding window to restrict the attention mechanism to a local window, and reconstruct the features within the window to model the relationships between neighboring internal elements within the local region. Then, we design a dual frequency domain branch framework consisting of four frequency domain subbands of DWT and the phase part of FFT to enrich the extraction of local forgery features from different perspectives. Through feature enrichment of dual frequency domain branches and fine-grained feature extraction of reconstruction sliding window attention, our method achieves superior generalization detection capabilities on both GAN and diffusion model-based generative images. Evaluated on diverse datasets comprising images from 65 distinct generative models, our approach achieves a 2.13% improvement in detection accuracy over state-of-the-art methods. Our code is available at https://github.com/HorizonTEL/DFFreq-main

Abstract:
Online model training is pivotal for enabling multiuser semantic communication systems to adapt to dynamic channel conditions. However, conventional frameworks suffer from prohibitive communication overhead and vulnerabilities to privacy attacks, hindering practical deployment. This paper proposes semantic information mixup (SIMix), a secure and efficient training framework that integrates Over-the-Air Mixup (OAM) with label-aware user grouping to jointly optimize spectral efficiency and semantic security. The OAM mixes semantic features of multiple users via wireless channels, inherently obfuscating sensitive data while reducing communication overhead. A closed-form Tx-Rx scaling optimization minimizes the mean square error (MSE) of over-the-air computation under channel noise, ensuring stable convergence in low-SNR regimes. Furthermore, an extended max-clique algorithm dynamically partitions users into groups with minimal intra-label similarity, reducing model inversion attack success rates. Experiments on CIFAR-10 and Tiny ImageNet demonstrate that the proposed approach is superior in terms of communication efficiency and security: it reduces communication overhead by up to 25%, attains 17.58 dB PSNR (a 20.98 dB reduction) under inversion attack, and reduces the attack success rate under label inference attack by 13.44%, while achieving comparable transmission accuracy.

Abstract:
With the growing adoption of Internet of Things (IoT) devices, ensuring the security of wireless communications has become increasingly critical. Radio frequency fingerprint identification (RFFI) has shown promise in this regard due to its capability of uniquely identifying devices. Although deep learning (DL) approaches have significantly improved RFFI performance, they typically rely on large-scale centralized data. This poses challenges in terms of privacy preservation and heterogeneous data distributions. To address the performance degradation caused by non-independent and identically distributed (non-IID) data in cross-receiver scenarios, this paper proposes a feature alignment strategy based on federated learning (FL) for RFFI. In such scenarios, due to differences in receiver hardware characteristics, deployment locations, and channel conditions, the signals captured by different receivers often exhibit distribution shifts, resulting in misaligned feature spaces across clients. The proposed method guides each client to learn aligned intermediate feature representations during local training, effectively mitigating the resulting adverse impact on model generalization. Experiments conducted on a real-world RF dataset demonstrate that the proposed method achieves higher identification accuracy and improved stability compared with representative federated baselines, including FedAvg and FedProx. The highest identification accuracy reaches 90.83%, and the performance gains are accompanied by generally reduced variance across different client configurations, highlighting the robustness and generalization capability of the proposed approach in heterogeneous wireless environments.

Abstract:
Specific emitter identification (SEI) separates the radio frequency fingerprint (RFF) from signals, which is of great significance in solving Internet of Things (IoT) security problems. However, the scarcity of high-quality, diverse, and labeled data in real-world scenarios limits the application of SEI. Under such conditions, the SEI is referred to as few-shot SEI (FS-SEI). To surmount this challenge, we propose a diffusion model-based data augmentation method capable of generating a substantial volume of diverse, high-quality data. Specifically, we develop a multi-scale convolutional block attention module denoising diffusion probabilistic model (MSCBAM-DDPM), which enhances feature capture capabilities, laying the foundation for the generation of diverse data. Furthermore, we propose an adaptive two-stage multi-domain loss function that guides the model to learn the characteristics of the original data and further derive other similar features, thereby achieving the goal of generating diverse and high-quality data. Finally, we theoretically derive the feasibility of the proposed loss function and further demonstrate the excellent diversity and quality of the data generated by our method, as well as its considerable gain for FS-SEI, through extensive experiments on real-world signal datasets.

Abstract:
False data injection (FDI) attacks can mislead the system operator into making incorrect dispatch decisions, causing cyber-induced physical line overloads. However, conventionally constructed false data either deviates substantially from normal data, making it easy to detect, or fails to overload multiple lines. To improve both attack stealth and overload impact, this paper proposes a bilevel cyber-induced overloads (CIO) mechanism that can cause a predefined number of multi-line overloads while considering post-attack economic dispatch, where the injected false data is minimized to improve attack stealth. Within this mechanism, a detailed CIO attack model is formulated that explicitly incorporates the post-attack economic dispatch, enabling the design of more practical attack strategies. One advanced feature of this CIO attack model is the optimal selection of overloaded lines for the bilevel optimisation of cyberattack resources and post-attack impacts in terms of line overloads, operation costs, and load loss. To solve the proposed model, it is first converted into a single-level nonlinear model using strong duality theory, and the CIO attacks are then discretised to convert this model into a mixed-integer linear programming (MILP) problem. Case studies conducted on an IEEE 14-bus power system and an industrial 39-bus power system validate the superiority and effectiveness of the proposed approach.

Abstract:
UAV-based e-commerce, which employs uncrewed aerial vehicles (UAVs) to deliver commodities from sellers to buyers, plays a central role in the low-altitude economy. In the interests of both buyers and sellers, UAV-based e-commerce usually relies on a fair exchange protocol between them. Despite extensive research on fair exchange protocol design, UAV-based e-commerce has two features that pose extra challenges to achieving mutual fairness: delivery fraud and path manipulation. On one hand, UAVs usually deliver real-world commodities rather than virtual assets (e.g., digital books), making it hard to verify the physical delivery status. In particular, this verification is easy for virtual assets, whose hash values can be checked, but inapplicable to real-world commodities, which cannot be naturally hashed. On the other hand, since UAV costs may vary across air areas, the seller may manipulate delivery bills by claiming unnecessarily expensive paths. These challenges have newly emerged in UAV-based e-commerce and have not been considered in traditional protocols. The primary goal of this work is to propose the first fair exchange protocol that achieves mutual fairness in UAV-based e-commerce. To verify commodity delivery, we develop a tracing mechanism that transforms raw UAV-collected footage into immutable delivery evidence, which can be integrated into existing virtuality-oriented protocols for further verification. To counter path manipulation, we introduce an auditing mechanism that enables buyers to verify that the chosen delivery path was generated by a trustworthy algorithm (e.g., a well-trained artificial general intelligence model). By building these two mechanisms on cryptographic tools, we theoretically prove the fairness of our proposed protocol. We also implement a prototype and observe that the whole protocol incurs only minute-magnitude cost across different settings, which validates its practicality.

Abstract:
Connected autonomous vehicles (CAVs) utilize multi-modal sensors, such as LiDAR and high-definition cameras, to collect diverse types of sensing data. Fusing object detection information from these two modalities facilitates more accurate environmental perception. In this context, lightweight secret sharing techniques are employed to protect information privacy, enabling further calculation while effectively alleviating the computational resource constraints of CAVs. Meanwhile, such techniques require an additional third-party to generate some necessary random numbers. Addressing the challenges of privacy disclosure of multi-modal object information and the reliability of random numbers, we propose a malicious third-party-resistant privacy-preserving multi-modal object fusion model, termed MPOF. First, we develop a series of secure computation protocols that do not rely on time-consuming cryptographic primitives, including secure multiplication, secure sharing conversion, and secure comparison. Leveraging the idea of sacrificial verification, we can effectively detect malicious behavior by the third-party during the random number generation process. Second, we construct a secure object bounding-box matching module based on arithmetic secret sharing (ASS), enabling similarity calculation and matching of bounding-boxes between point cloud and image modalities. Additionally, we design a secure object score fusion module that achieves fusion and updating through secure implementations of convolution, ReLU, and Maxout operations. Detailed theoretical analysis and experimental results demonstrate that, compared to secure computation protocols using homomorphic encryption for random number generation, the proposed protocols reduce computational overhead by five orders of magnitude. Furthermore, the MPOF model constructed by integrating these protocols is secure, accurate, and efficient.
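To make the role of the third party's random numbers concrete, here is a minimal sketch of two-party secure multiplication over additive (arithmetic) secret shares using a Beaver multiplication triple. That MPOF's secure multiplication follows this classic construction is an assumption, as are the ring size `MOD` and all function names; the sacrificial verification step that detects a malicious third party is not shown.

```python
import secrets

MOD = 2 ** 32  # assumed ring Z_{2^32}; the abstract does not fix a ring


def share(x: int) -> tuple:
    """Split x into two additive shares that sum to x modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD


def reconstruct(s0: int, s1: int) -> int:
    """Recombine two additive shares."""
    return (s0 + s1) % MOD


def make_triple() -> tuple:
    """Role of the third party: hand out shares of random a, b and c = a*b."""
    a = secrets.randbelow(MOD)
    b = secrets.randbelow(MOD)
    return share(a), share(b), share((a * b) % MOD)


def secure_mul(x_sh: tuple, y_sh: tuple, triple: tuple) -> tuple:
    """Return shares of x*y, revealing only the masked values e and f."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    # Each party masks its shares locally; e = x - a and f = y - b are opened.
    e = reconstruct((x_sh[0] - a0) % MOD, (x_sh[1] - a1) % MOD)
    f = reconstruct((y_sh[0] - b0) % MOD, (y_sh[1] - b1) % MOD)
    # x*y = c + e*b + f*a + e*f; only party 0 adds the public e*f term.
    z0 = (c0 + e * b0 + f * a0 + e * f) % MOD
    z1 = (c1 + e * b1 + f * a1) % MOD
    return z0, z1
```

The result is correct only if the triple really satisfies c = a*b, which is why a dishonest triple generator matters and why a check on the generated randomness is needed.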

Abstract:
Machine Learning (ML) malware detectors rely heavily on crowd-sourced AntiVirus (AV) labels, with platforms like VirusTotal serving as trusted sources of malware annotations. But what if attackers could manipulate these labels to classify benign software as malicious? We introduce label spoofing attacks, a new threat that contaminates crowd-sourced datasets by embedding minimal and undetectable malicious patterns into benign samples. These patterns coerce AV engines into misclassifying legitimate files as harmful, enabling poisoning attacks against ML-based malware classifiers trained on those data. We demonstrate this scenario by developing AndroVenom, a methodology for polluting realistic data sources and launching subsequent poisoning attacks against ML malware detectors. Experiments show that not only are state-of-the-art feature extractors unable to filter such injections, but various ML models experience Denial-of-Service (DoS) with as little as 1% poisoned samples. Additionally, attackers can flip decisions for specific unaltered benign samples by modifying only 0.015% of the training data, threatening their reputation and market share, while evading anomaly detectors operating on the training data. We conclude by raising concerns about the trustworthiness of ML training processes based on AV annotations and argue that further investigation is needed to develop more reliable labeling strategies.

Abstract:
Sharding is an important solution for improving blockchain scalability. The basic idea of blockchain sharding is to partition transactions among multiple disjoint shards that process them in parallel, maximizing system performance. Current sharding protocols mainly rely on periodic random node rotation (namely, node allocation and migration) among shards to ensure security, which is often considered the most challenging part of developing a sharding system. However, (1) if one or more shards are corrupted, the sharding system is no longer available; (2) to avoid shard corruption, each shard must be large, which limits the throughput of sharding protocols. To solve (1), we introduce CShard, a blockchain sharding protocol with scale-out transaction processing capacity. The main idea of CShard is to use repairable fountain codes (RFCs), an information coding method with the locality property, to rethink sharding design. By adjusting encoding parameters, topological associations among shards are constructed, which are then used to define the verification logic for transactions. The blocks of corrupted shards can be recovered through decoding by their corresponding shard groups, so the sharding system remains secure and available. Our approach uses encoding techniques to build a general architecture for sharding systems, establishing a paradigm of “encoding as verification” and opening new horizons in blockchain sharding. To solve (2), we propose the ghost reporter mechanism, which gives all nodes the chance to verify a transaction by submitting reports in the sharding network. The mechanism brings two direct benefits. First, it provides a way to detect corrupted shards and recover their blocks using RFCs. Second, it reduces the number of nodes needed in a single shard, removing the limitation of existing sharding schemes, which usually require many nodes per shard to ensure security. In principle, because of its generality, this mechanism is also applicable to other sharding protocols, both existing and future.

Abstract:
In this paper, we explore the vulnerability of the physical uplink shared channel (PUSCH) to a new smart jamming attack in fifth generation (5G) new radio (NR), in which an intelligent adversary first sniffs the resource scheduling information indicated by the downlink control information (DCI) and then disrupts the PUSCH data transmission effectively and covertly through precise jamming. To combat this kind of DCI-sniffing-based smart jamming (DCIS-SJ), we propose a novel suppression method that leverages DCI-scheduled subset identification and PUSCH resource reconstruction. Our method fundamentally relies on the differences in spatial-domain features between legitimate users and the DCIS-SJ attacker, under the available control channel elements and resource block group granularities, to selectively exclude unwanted elements while safeguarding the authenticity of the targeted transmissions. Numerical results confirm the effectiveness of our method.

Abstract:
Secure off-chain payment plays an important role in blockchain ecosystems, requiring atomicity guarantees where either all payment channels update their balances or none do. While the Lightning Network achieves this through multi-hop Hash Time-Lock Contracts (HTLCs), a critical flaw persists. Specifically, contract conditions (e.g., hash preimages) are transmitted in plaintext across payment paths. Consequently, malicious intermediate nodes can launch interception attacks by claiming funds from predecessors without releasing the corresponding coins to successors. To mitigate this risk, we propose a secure off-chain payment protocol. Firstly, a new cryptographic primitive, namely ciphertext unlinkable autonomous path proxy re-encryption scheme (CUAP-PRE), is proposed. Unlike traditional multi-hop proxy re-encryption, it prevents malicious nodes by enabling the delegator to designate all delegatees he trusts. In addition, ciphertext unlinkability resists homology inference attacks and delegation path tracing to ensure anonymity. Building on CUAP-PRE, a secure off-chain payment protocol (SOCP) in blockchain is designed with an enhanced multi-hop Hash Time-Lock Contract of the Payment-Channel Network (PCN). The receiver at the end of one path of payment channel can control the decryption rights of payment condition in multi-hop delegation manner with reversed order, to unlock the corresponding bitcoins on hold. We then present formal security proofs to demonstrate that the proposed CUAP-PRE and SOCP are post-compromise secure under the Decisional Bilinear Diffie-Hellman (DBDH) assumption in the random oracle model. Comprehensive evaluations also demonstrate the effectiveness and practicability of our proposal.
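The hash time-lock condition that the abstract builds on can be sketched as follows: funds are released to whoever presents the hash preimage before the timeout and refunded to the payer afterwards. This is a generic, simplified model of an HTLC, not the CUAP-PRE or SOCP construction; the class name, fields, and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class HTLC:
    """Simplified single-hop hash time-lock contract (illustrative only)."""
    hash_lock: bytes   # digest of the secret payment condition
    timeout: int       # time (e.g., block height) after which a refund is allowed
    amount: int
    settled: bool = False

    def claim(self, preimage: bytes, now: int) -> bool:
        """Payee unlocks the funds by revealing the preimage before the timeout."""
        if self.settled or now >= self.timeout:
            return False
        if hashlib.sha256(preimage).digest() != self.hash_lock:
            return False
        self.settled = True
        return True

    def refund(self, now: int) -> bool:
        """Payer reclaims the funds once the timeout has passed unclaimed."""
        if self.settled or now < self.timeout:
            return False
        self.settled = True
        return True
```

Because `claim` accepts the preimage from anyone who knows it, a node that learns the preimage in plaintext can satisfy the condition on its own incoming channel, which is the interception risk the abstract describes.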

Abstract:
The rise in network attacks has made robust malicious traffic detection crucial. However, the dynamic nature of network traffic causes concept drift, undermining the efficacy of traditional detection methods, which often rely on a static i.i.d data environment and struggle to adapt to new patterns. To overcome these limitations, we propose Argus, a novel framework for malicious traffic detection that operates in a comprehensive, automated, and adaptive manner. Argus tackles three core challenges: accurately classifying known traffic while detecting drift, automatically identifying malicious drifting traffic, and maintaining performance through continuous updates. To address these challenges, Argus integrates a contrastive learning-based module to produce compact representations of traffic and implements a fine-grained drift detection method using category-specific reconstruction loss distributions. For drifting traffic, Argus uses clustering-based automated identification to detect attacks without human intervention. Furthermore, a distance-constrained update mechanism ensures smooth model adaptation, preserving stability and accuracy. Extensive experiments demonstrate that Argus achieves superior performance, with an average F1 score exceeding 95% under various conditions and retaining robust performance even under extreme drift scenarios.
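As a rough illustration of drift detection from category-specific reconstruction losses, the sketch below fits one loss threshold per traffic class and flags a sample as drifting when its loss exceeds the threshold of its predicted class. The mean-plus-k-standard-deviations rule, the function names, and the class labels are assumptions made for illustration; the abstract does not state how Argus uses the loss distributions.

```python
from statistics import mean, pstdev
from typing import Dict, List


def fit_thresholds(losses_by_class: Dict[str, List[float]],
                   k: float = 3.0) -> Dict[str, float]:
    """One threshold per class: mean + k * std of its training losses."""
    return {label: mean(losses) + k * pstdev(losses)
            for label, losses in losses_by_class.items()}


def is_drifting(predicted_class: str, loss: float,
                thresholds: Dict[str, float]) -> bool:
    """Flag a sample whose reconstruction loss is atypical for its class."""
    return loss > thresholds[predicted_class]
```

Keeping a separate threshold per class avoids letting a class with naturally high reconstruction loss mask drift in a class with low loss.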

Abstract:
We study the problem of leaky private information retrieval (L-PIR), where the amount of privacy leakage is measured by the pure differential privacy parameter, referred to as the leakage ratio exponent. Unlike the previous L-PIR proposed by Samy et al., which is merely a re-allocation of the clean (low-cost) retrieval pattern within the generalized TSC family, we show that the active pure-DP constraints couple adjacent Hamming-weight layers of the random key, which reduces the optimization to a layered problem whose optimum is geometric across these layers. As a result, only cyclic permutations are needed without loss of optimality, and lower-Hamming-weight keys should be assigned higher probabilities. This new scheme provides a significant improvement, leading to an O(log K) leakage ratio exponent with fixed download cost D, in contrast to the previous art that only achieves a Θ(K) exponent, where K is the number of messages.
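One way to read the leakage ratio exponent is as the smallest ε such that, for every query a server can observe, the probability of that query under any desired message index is at most e^ε times its probability under any other index. The sketch below computes that ε for a small, explicitly tabulated query distribution; the function name and the dictionary layout are illustrative assumptions, and it does not implement the proposed scheme.

```python
import math
from typing import Dict, Hashable


def leakage_ratio_exponent(
        query_dist: Dict[int, Dict[Hashable, float]]) -> float:
    """Smallest eps with P(q | i) <= exp(eps) * P(q | j) for all q, i, j.

    query_dist maps each desired message index to a distribution over queries.
    """
    queries = set().union(*(dist.keys() for dist in query_dist.values()))
    eps = 0.0
    for q in queries:
        probs = [dist.get(q, 0.0) for dist in query_dist.values()]
        hi, lo = max(probs), min(probs)
        if hi == 0.0:
            continue  # query never sent for any index
        if lo == 0.0:
            return math.inf  # query rules out some index entirely
        eps = max(eps, math.log(hi / lo))
    return eps
```

A query distribution that is identical for every index gives ε = 0 (perfect privacy), while any query that is impossible for some index gives ε = ∞.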

Abstract:
Graph-based Retrieval-Augmented Generation (RAG) has achieved remarkable success in refining the outputs of Large Language Models (LLMs), enabling them to integrate relational and multi-hop knowledge into context-aware responses by constructing a knowledge graph from an external database. In this paper, we focus on the underexplored security risks arising from the external database, and propose the first backdoor attacks against the graph-based RAG of LLMs. Specifically, attackers insert the backdoor into the knowledge graph as entities by poisoning a carefully crafted corpus into the external database, thereby causing LLMs to output attacker-desired answers for trigger-containing queries while preserving correct answers for others. The attacks are formulated as a minimax problem, whose solution is a poison corpus. Powered by the chain-of-thought reasoning capabilities of LLMs, we propose a new strategy to solve the minimax problem. We craft retrieval text to insert triggers into the knowledge graph as entities, exploit hijacking text to redirect LLMs’ attention toward attacker-desired answers, and finally link the hijacking text to the triggers so that it serves as context only for trigger-containing queries. In addition, our attacks involve three types of triggers, including word-level, topic-level, and semantic-level, with progressively increasing stealthiness. Empirical results across multiple knowledge databases and language models indicate that the proposed attacks achieve the desired attack performance. Our findings highlight the substantial risks in LLM applications (e.g., chatbots and agents) built on graph-based RAG systems.

Abstract:
Decentralized Access Control (DAC) manages access through multiple entities and consists of two modules: decentralized secret management and access policies. However, existing DAC schemes lack support for managing secrets with time-based conditions, such as triggering secret release after a certain time bound. As a result, users may gain access to information before the designated time, which is undesirable in scenarios involving time-sensitive data. Moreover, current DAC schemes mainly focus on identity confidentiality and lack support for policy confidentiality, which may lead to leakage of sensitive information in access policies. To address these challenges, we propose Heimdall, a decentralized access control scheme with time-based secret management and private access policies. The core of our solution is the dhNIZK protocol, an efficient non-interactive zero-knowledge protocol designed for the verifiable incorporation of time conditions into threshold cryptosystems. We utilize the dhNIZK protocol and homomorphic time-lock puzzles to enable time-based secret management, improving the efficiency of secret reconstruction through batch puzzle-solving techniques. Furthermore, we enhance the garbling scheme’s encoding algorithm to ensure policy confidentiality while maintaining identity confidentiality. Finally, we implement Heimdall and present experimental results demonstrating its superior performance compared to state-of-the-art solutions.

Abstract:
Uncrewed aerial vehicles (UAVs) are increasingly employed to perform high-risk tasks that require minimal human intervention. However, they face escalating cybersecurity threats, particularly from GNSS spoofing attacks. While previous studies have extensively investigated the impacts of GNSS spoofing on UAVs, few have focused on its effects on specific tasks. Moreover, the influence of UAV motion states on the assessment of cybersecurity risks is often overlooked. To address these gaps, we first provide a detailed evaluation of how motion states affect the effectiveness of network attacks. We demonstrate that nonlinear motion states not only enhance the effectiveness of position spoofing in GNSS spoofing attacks but also reduce the probability of detecting speed-related attacks. Building upon this, we propose a state-triggered backdoor attack method (SSD) to deceive GNSS systems and assess its risk to trajectory planning tasks. We extensively validate the effectiveness and stealthiness of SSD. Experimental results show that, with appropriately tuned hyperparameters, SSD significantly increases positioning errors and the risk of task failure, while maintaining high stealth rates against three state-of-the-art detectors.

Abstract:
Hardware supply-chain attacks pose significant security threats to the boot process of multiprocessor systems. In this paper, we investigate critical stages of the multiprocessor system boot process and identify a new, prevalent hardware supply-chain attack surface that can bypass secure boot due to the absence of processor-authentication mechanisms. To defend against such attacks, we present PA-Boot, the first formally verified processor-authentication protocol for secure boot in multiprocessor systems. PA-Boot is proved functionally correct and is guaranteed to detect multiple adversarial behaviors, such as processor replacements and man-in-the-middle attacks. The fine-grained formalization of PA-Boot and its fully mechanized security proofs are carried out in the Isabelle/HOL theorem prover with 348 lemmas/theorems and ~7,100 LoC. We further implement an instance of PA-Boot in C. Experiments on the proof-of-concept implementation indicate that PA-Boot can effectively identify boot-process attacks with a minor overhead (4.98% on the Linux boot process) and thereby improve the security of multiprocessor systems.

Abstract:
With the enhanced performance of large language models (LLMs) on natural language processing tasks, potential moral and ethical issues arise when LLMs are deployed in Web services. Malicious attackers use techniques such as prompt engineering to jailbreak LLMs and induce them to generate illegal or privacy-invasive content. In response, LLMs counter such attacks with techniques such as safety alignment. However, this strong defense typically takes the form of rejection replies that are unrelated to the attacker’s query; such replies are easily identified by attackers and leak information that strengthens their capabilities. We propose an LLM-driven defense task that generates safe and disguised responses which remain relevant to the attack request while concealing defensive intent and withholding any useful information from the attacker. To achieve this task, we construct an LLM-based multi-agent attacker-disguiser game that trains two LLM-based players to select in-context learning (ICL) examples and strategies, guiding them toward better generation and toward a Nash equilibrium. To enhance the LLMs’ generation via ICL, we apply Minimax Q-learning to optimize the players’ choice of ICL examples. To further refine generation, we use policy gradient (PG) to train the strategy selectors to select prompt strategies for the LLMs. To improve the ICL examples and prompts used for generation, we iteratively train the players with ICL sample selection training and prompt strategy training. The goal of this game is to arrive at the Nash equilibrium. Through training, we obtain an LLM-based defender that can receive jailbreak prompts from malicious users and output safe and disguised responses. The experiments reveal that our method strengthens the LLM’s ability to disguise its defensive intent compared to other methods. Our method also minimizes the attacker-perceived drop in attack success rate while delivering the largest actual reduction. This effectively misleads the attacker’s evaluation and enhances practical defense utility.

Abstract:
Inconsistencies in manufacturing features, sampling settings, and cryptographic implementations among the profiling and target devices can lead to the failure of profiling side-channel analysis (SCA). Various techniques, such as preprocessing, multi-device training, and transfer learning, have been proposed to mitigate this portability problem in profiling SCA. However, many block cipher design techniques, such as tweaks, key-dependent components, and customized elements, may have uncertain effects at the level of cryptographic implementations, and their impact on portability requires further analysis. This paper investigates the portability of profiling SCA through a case study using adjustable implementations of block ciphers. First, we theoretically analyze the variation in leakage distribution under adjustable implementations. To support our theoretical results, a dataset for deep-learning SCA is built from AES, Pilsung, and Skinny. Specifically, we reveal how to reverse the parameterized components and recover the key from these adjustable implementations. According to our experiment on an 8-bit AVR microcontroller, the computational complexities of the attacks based on our model are less than 9 × 2^16 within 4500 traces. Moreover, the effectiveness of our proposed method is demonstrated under the combined effect of adjustable implementations and device characteristics. Our case study provides insights into the effects of adjustable implementations of block ciphers, which strengthens both the theoretical and practical understanding of the portability of profiling SCA.

Abstract:
Simple Power Analysis is a commonly used method in Side-Channel Analysis of cryptosystems, but it requires significant manual labor for trace segmentation. General Pulse Tailor for Simple Power Analysis (SPA-GPT), proposed at CHES 2024, utilizes reinforcement learning to achieve automated segmentation. However, its low efficiency and its restriction to public-key algorithms limit its practical applications. In this paper, we propose a practical method, which utilizes a long short-term memory network and an attention mechanism, coupled with a new deep Q-network policy using a Simulated Annealing strategy, to resolve the conflict between reinforcement learning and high efficiency in trace segmentation. Moreover, the novel agent proposed in this paper also demonstrates transferability: once fully trained, it can directly segment traces of varying lengths and signal-to-noise ratios. In addition, our new approach is applicable to locating each execution of block ciphers in various encryption modes. Comparative experiments are conducted on 14 datasets, which are collected from software or hardware implementations of RSA, ECC, ML-KEM, AES, PRESENT, and SIMON, running on microcontrollers, FPGAs, or smart cards. Experimental results show that the new method improves time efficiency by 50.34% to 94.24% while reducing network parameters by 87.84% compared to SPA-GPT.

Abstract:
Mobile Crowdsensing (MCS) is emerging as a new paradigm across various domains. Reputation plays a crucial role in selecting reliable workers, which is essential for ensuring data quality. However, existing privacy-protection protocols restrict the sharing of worker reputation across platforms, leading to difficulties in obtaining high-quality data and to the Cold Start for Trust Evolution (CSTE) problem. To address these issues, we propose a Cross-Platform Reputation Privacy Sharing (CPRPS) scheme aimed at accelerating high-quality data collection. Specifically, CPRPS introduces two key components: 1) a privacy-preserving protocol that ensures secure, efficient sharing of reputation, and 2) a novel cross-platform reputation sharing scheme that integrates cryptographic methods with blockchain. Together, these components enable the secure sharing of trustworthy worker reputations across platforms, while ensuring strong privacy protection for all workers. Additionally, leveraging these components, we design a reputation evolution scheme and a multi-factor-assisted Truth Discovery (TD) algorithm, which facilitate rapid, accurate worker reputation assessments, thereby improving data quality. Theoretical analyses of the CPRPS scheme’s security and computational complexity, along with extensive simulations on real-world datasets, demonstrate that the CPRPS scheme improves worker reputation speed by 9.3%-48.0%.

Abstract:
Escalating cybersecurity risks in communication networks pose severe challenges to the stability and reliability of microgrid control systems. This paper investigates the load frequency control problem of nonlinear networked wind-power microgrids under denial-of-service (DoS) attacks and event-triggered communication. A data-driven prediction framework is first established to construct a control-oriented dynamic linearization data model for the underlying nonlinear system. Subsequently, a novel event-triggered model-free adaptive predictive control (ET-MFAPC) algorithm with time-varying parameters is proposed, which integrates multi-step adaptive prediction and receding-horizon optimization. The proposed algorithm features several key innovations: 1) An output-related auxiliary variable is designed to circumvent unavailable pseudo-partial derivatives (PPDs) and their sign constraints. 2) An adaptive predictive compensation module is developed to mitigate the impacts of DoS attacks by reconstructing hijacked data packets. 3) An attack-aware online optimization mechanism is formulated to adaptively tune controller gains for improved system performance. Furthermore, a comprehensive security analysis is conducted based on a model-independent data energy function, and explicit boundary conditions are derived for tolerable attack frequency and duration. Finally, the effectiveness and robustness of the proposed method are validated through simulation studies.
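The idea of event-triggered communication can be illustrated with a minimal sketch: a measurement is transmitted only when it differs from the last transmitted value by more than a threshold, which saves bandwidth between events. The absolute-error triggering rule, the fixed threshold, and the function name are assumptions for illustration; the abstract does not specify the triggering condition used by ET-MFAPC.

```python
from typing import Iterable, List, Optional, Tuple


def event_triggered_stream(samples: Iterable[float],
                           threshold: float) -> List[Tuple[int, float]]:
    """Return the (time index, value) pairs that would actually be sent."""
    sent: List[Tuple[int, float]] = []
    last_sent: Optional[float] = None
    for k, y in enumerate(samples):
        # Always send the first sample; afterwards send only on a large change.
        if last_sent is None or abs(y - last_sent) > threshold:
            sent.append((k, y))
            last_sent = y
    return sent
```

Between triggering instants the controller must work from the last received value or a prediction of it, which is also the situation created when a DoS attack blocks a packet.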

Abstract:
Subgraph Federated Learning (FL) has emerged as a promising paradigm for node classification tasks wherein subgraphs derived from a global graph are distributed across multiple devices to mitigate data leakage risks. Similar to other FL systems, subgraph FL faces significant security challenges, particularly from backdoor attacks, an area that remains extensively underexplored. Existing attacks typically follow a two-phase strategy to implant backdoors. However, in subgraph FL, such attacks often lead to Divergence Amplification, a phenomenon characterized by significant parameter discrepancies between normal and backdoored models, thereby compromising attack stealthiness. To tackle this challenge, we propose BEEF, a Backdoor attack with an End-to-End Framework designed for effectiveness, stealth, and durability. Unlike conventional methods, BEEF incorporates a dedicated trigger generator, which is jointly trained with a backdoored model. To increase its stealthiness, BEEF crafts adversarial perturbations as triggers that provoke misclassification while leaving the model’s parameters entirely untouched. Furthermore, by calibrating a subset of low-salience parameters associated with backdoor activation, BEEF ensures stable performance and sustained effectiveness across FL rounds. Comprehensive evaluations across eight datasets, four models, five state-of-the-art attacks, and six aggregation methods demonstrate BEEF’s effectiveness in deceiving GNNs while maintaining minimal impact on normal data performance. Additionally, we adapt BEEF to federated graph classification tasks, broadening its applicability and practicality.

Abstract:
The proliferation of maliciously altered short videos on social media platforms poses a significant threat to information security ecosystems, eroding public trust in digital media. Despite recent advancements in detecting fake video news, significant challenges remain in the forensic analysis of short videos, leading to biased detection. First, as technology rapidly advances, fake videos are becoming increasingly semantically convincing, undermining the effectiveness of current classification methods. Second, the heterogeneous nature of video modalities (visual, textual, audio) makes it difficult for models to learn discriminative feature representations. To address these challenges, we propose a dynamic query framework for fake news forensics in short videos, termed the Semantic Guided Adaptive Network (SGAN). Our approach is motivated by the need to utilize superficial alignment to identify suspicious manipulations through anchor-based verification and to leverage the adaptive capability of learnable queries to learn the heterogeneous boundary in each modality. Specifically, SGAN comprises a verification module and a flexible query learning module. The verification module employs text as the anchor to verify detailed context, mining fine-grained information while emphasizing key features, thereby providing candidate manipulations for downstream modules. The query learning module leverages learnable queries to map heterogeneous forensic features and integrates them through multi-level fusion for decision-making. Extensive experiments conducted on two widely used datasets demonstrate the effectiveness and generalization of the proposed method. The code is released on GitHub at https://github.com/VILAN-Lab/SGAN

Abstract:
With increasing security threats in critical infrastructure, intelligent X-ray inspection systems have become essential for modern security frameworks. Contraband detection faces dual challenges: semantic dilution across cross-scale features and degraded directional sensitivity. Existing detection paradigms rely on Euclidean spatial assumptions while ignoring X-ray projection geometry characteristics, leading to feature representation ambiguity and insufficient spatial relationship modeling in complex occlusion scenarios. To address these challenges, we propose RFFRDet, a rotation detector based on refined feature decoupling. First, an Integrated Local Attention (ILA) module is constructed that enables material-aware and geometry-aware feature enhancement through channel-spatial decoupling. Second, a Multi-Resolution Feature Fusion (MRFF) network is designed that achieves optimal coupling between fine-grained spatial positioning and high-level semantic understanding through parallel multi-scale aggregation. Finally, we construct the first multi-directional X-ray security benchmark dataset with rotation annotations for 27,000 images. Extensive experiments demonstrate that RFFRDet achieves 97.4% mAR and 96.8% mAP, representing improvements of 1.6% and 1.5% over state-of-the-art methods, respectively.

Abstract:
Gait recognition aims to identify individuals based on walking patterns in a long-range, contactless manner. While camera-based methods have advanced significantly, their performance deteriorates under poor lighting conditions. LiDAR offers a promising alternative by capturing accurate 3D gait information regardless of illumination. However, effectively integrating heterogeneous data from diverse sensors, such as LiDAR and cameras, remains a key challenge for cross-modal gait recognition. Existing approaches often minimize modality discrepancy directly, which can lead to class collapse and damage to inter-class discriminability. To overcome these limitations, we propose a Semantic-Guided Cross-modal Gait recognition framework, SG-CrossGait, that introduces text features as the prototype space to bridge camera and LiDAR modalities. We design structured Gait Description Factors (GDF) and leverage multimodal large language models (MLLMs) for automatic factor annotation and text generation, enriching existing datasets with textual descriptions, yielding SUSTech1K-Text and FreeGait-Text. A CLIP-based pipeline aligns multi-grained representations from both modalities to the text prototype space. We further propose the Dual-stream Cross-attention Fusion (DCF) module for fine-grained feature integration and the Semantic-Guided Feature Decoupling (SGFD) module to disentangle shared and modality-specific features. A Multi-task Training (MT) scheme incorporating Gait Attribute Recognition (GAR) further enhances intra-class compactness. Extensive experiments validate the effectiveness of our approach. On SUSTech1K-Text, our method achieves 61% accuracy in LiDAR-to-Camera recognition, outperforming the state-of-the-art method by 8.3%. We also release the Gait-Text benchmark to promote future research at the intersection of gait analysis and vision-language learning. Code and datasets are available at: https://github.com/O-VIGIA/SCCG.git

Abstract:
Active defense strategies have been developed to counter the threat of deepfake technology. However, a primary challenge is their lack of persistence, as their effectiveness is often short-lived. Attackers can bypass these defenses by simply collecting protected samples and retraining their models. This means that static defenses inevitably fail when attackers retrain their models, which severely limits practical use. We argue that an effective defense not only distorts forged content but also blocks the model’s ability to adapt, which occurs when attackers retrain their models on protected images. To achieve this, we propose an innovative Two-Stage Defense Framework (TSDF). Benefiting from the intensity separation mechanism designed in this paper, the framework uses dual-function adversarial perturbations to perform two roles. First, it can directly distort the forged results. Second, it acts as a poisoning vehicle that disrupts the data preparation process essential for an attacker’s retraining pipeline. By poisoning the data source, TSDF aims to prevent the attacker’s model from adapting to the defensive perturbations, thus ensuring the defense remains effective long-term. Comprehensive experiments show that the performance of traditional interruption methods degrades sharply when these methods are subjected to adversarial retraining. However, our framework shows a strong dual defense capability, which can improve the persistence of active defense. Our code will be available at https://github.com/vpsg-research/TSDF.

Abstract:
Cloud-fog-assisted electronic health record (EHR) systems offer promising solutions for large-scale medical data storage and processing. However, they also raise critical privacy concerns, particularly regarding secure computation over sensitive data, fine-grained bilateral access control, dynamic revocation, and decryption key exposure. Existing cryptographic primitives, such as functional encryption and matchmaking encryption, address some of these challenges individually but fail to offer a unified solution. In this work, we design a revocable and privacy-preserving data computing system with bilateral access control (RPDC-BAC) for cloud-fog-assisted EHR sharing by proposing a novel cryptographic primitive, called server-aided revocable attribute-based matchmaking functional encryption (SR-AB-MFE). Specifically, the proposed scheme supports expressive bilateral access control and computation over encrypted data. In addition, it incorporates time-evolving decryption keys and a server-aided revocation mechanism to mitigate key exposure and efficiently revoke users. To further reduce receiver-side overhead, fog nodes assist in ciphertext authentication and partial decryption. We formally define the proposed primitive and prove its security under static assumptions. Finally, extensive experimental results demonstrate the efficiency and practicality of our design.

Abstract:
With the rapid proliferation of uncrewed aerial vehicles (UAVs) in civilian and industrial applications, the risk of malicious or unauthorized UAV use has become a critical security concern. Existing machine learning (ML)-based UAV recognition methods offer a certain degree of interpretability, but their performance is often limited in complex environments and across diverse UAV types. In contrast, deep learning (DL)-based methods exhibit strong representation capability, yet they generally lack physical interpretability. To address this issue, we propose an interpretable UAV recognition framework, termed frequency-aware network for UAV recognition (FANet-UAV), which performs coarse-to-fine feature learning in the frequency domain. Specifically, a multiplication filter module (MFM) is first designed to capture coarse-grained spectral patterns by exploiting multi-mode and multi-scale frequency characteristics of UAV signals. Based on these coarse representations, a convolutional neural network (CNN) is further employed to extract fine-grained discriminative features for accurate classification. Experimental results on two public UAV datasets demonstrate the effectiveness of the proposed method. In particular, FANet-UAV improves the recognition accuracy from 90.45% to 96.82% on DroneRFa and from 94.15% to 98.83% on DroneRF. Moreover, visualization results and channel-wise SHAP analysis provide both pre-hoc and post-hoc interpretability, revealing that FANet-UAV mainly relies on flight control signal (FCS) features for decision-making, while video transmission signal (VTS) features contribute less to the final recognition results.

Abstract:
With the rapid advancement of Web 3.0 technologies, public blockchain platforms are witnessing the emergence of novel services designed to enhance user privacy and anonymity. However, the powerful untraceability features inherent in these services inadvertently make them attractive tools for criminals seeking to launder illicit funds. Notably, existing de-anonymization methods face three major challenges when dealing with such transactions: highly homogenized transactional semantics, limited ability to model temporal discontinuities, and insufficient consideration of structural sparsity in account association graphs. To address these, we propose GradWATCH, designed to track anonymous accounts in Ethereum privacy-preserving services. Specifically, we first design a learnable account feature mapping module to extract informative transactional semantics from raw on-chain data. We then incorporate transaction relations into the account association graph to alleviate the adverse effects of structural sparsity. To capture temporal evolution, we further propose an edge-aware sliding-window mechanism that propagates and updates gradients at three granularities. Finally, we identify accounts controlled by the same entity by measuring their embedding distances in the learned representation space. Experimental results show that even under the conditions of unbalanced labels and sparse transactions, GradWATCH still achieves significant performance gains, with relative improvements ranging from 1.62% to 15.22% in the MRR and from 3.85% to 7.31% in the F1 score.

Abstract:
Large Language Models (LLMs) undergo continuous updates to improve user experience. However, prior research on the security and safety implications of LLMs has primarily focused on their specific versions, overlooking the impact of successive LLM updates. This prompts the need for a holistic understanding of the risks in these different versions of LLMs. To fill this gap, in this article, we conduct a longitudinal study to examine the adversarial robustness (specifically misclassification, jailbreak, and hallucination) of three prominent LLM families: GPT, Llama, and Qwen. Our study reveals that LLM updates do not consistently improve adversarial robustness as expected. For instance, a later version of GPT-3.5 degrades regarding misclassification and hallucination despite its improved resilience against jailbreaks. GPT-4 and GPT-4o demonstrate (incrementally) higher robustness overall. Larger Llama and Qwen models do not uniformly exhibit improved robustness across all three aspects studied. In addition, larger model sizes do not necessarily yield improved robustness. Minor updates lacking substantial robustness improvements can exacerbate existing issues rather than resolve them. We hope our study can offer valuable insights into navigating model updates and making informed decisions in model development and usage.

Abstract:
This paper investigates an explicit polar coding scheme for a two-user discrete memoryless multiple access wiretap channel with partial rate-limited feedback (MAC-WT-PLF). Feedback is exploited to increase channel-input correlation, inject dummy messages, and encrypt messages using a one-time pad. Existing MAC-WT polar coding schemes assume independent channel inputs, which cannot support the correlation requirement. Therefore, we propose an explicit polar mapping that leverages feedback to introduce correlation between the channel inputs. This mapping allows both the transmitter and the receiver to restructure the correlation in opposite decoding directions. The proposed scheme relies on source polarization, block Markov coding, superposition coding, lossy source coding, and rate-splitting, without making symmetry or degradation assumptions on the channel model. Rigorous information-theoretic asymptotic analysis establishes that the proposed scheme ensures both reliability and strong secrecy, and attains the entire achievable secrecy rate region given in prior work.

Affiliations: School of Automation, Guangdong University of Technology, Guangzhou, China; Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong; School of Cyberspace Science and Technology, Beijing Jiaotong University, Beijing, China; School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China; College of Computing and Data Science, Nanyang Technological University, Jurong West, Singapore; Department of Computer Science, City University of Hong Kong (Dongguan), Dongguan, Guangdong, China

Abstract:
Smart contracts play a significant role in automating blockchain services. Nevertheless, vulnerabilities in smart contracts pose serious threats to blockchain security. Currently, traditional detection methods primarily rely on static analysis and formal verification, which can result in high false-positive rates and poor scalability. Large Language Models (LLMs) have recently made significant progress in smart contract vulnerability detection. However, they still face challenges such as high inference costs and substantial computational overhead. In this paper, we propose ParaVul, a parallel LLM and retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. Specifically, we first develop Sparse Low-Rank Adaptation (SLoRA), a technique for efficient LLM fine-tuning tailored to smart contract vulnerability detection. Distinct from existing LoRA methods, SLoRA inserts parallel sparse and low-rank branches after the attention projection and the feed-forward block, enabling LLMs to capture both global code semantics and localized vulnerability patterns while maintaining low training overhead. We then construct a vulnerability contract knowledge base and develop a hybrid Retrieval-Augmented Generation (RAG) system that integrates Okapi BM25 with dense retrieval to provide complementary lexical and semantic evidence for smart contract vulnerability verification. Furthermore, we propose a meta-learner-based gated verification module to fuse the outputs of the SLoRA detector and the two RAG-based detectors, thereby generating the final detection results. After completing vulnerability detection, we design chain-of-thought prompts to guide LLMs to generate comprehensive vulnerability detection reports. Simulation results demonstrate the superiority of ParaVul, especially in terms of F1 scores, achieving 0.9398 for single-label detection and 0.9930 for multi-label detection.
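As a rough illustration of the hybrid retrieval idea in this abstract, the sketch below scores documents with Okapi BM25 and with cosine similarity, then fuses the two rankings with a weighted sum. The function names, the `alpha` weight, the min-max normalization, and the hand-written toy vectors (standing in for a dense encoder) are assumptions made for illustration, not details taken from ParaVul.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * c for a, c in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(c * c for c in v))
    return dot / (nu * nv) if nu and nv else 0.0

def minmax(xs):
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]

def hybrid_rank(query, docs, query_vec, doc_vecs, alpha=0.5):
    """Return document indices sorted by fused lexical + dense score."""
    lexical = minmax(bm25_scores(query, docs))
    semantic = minmax([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)

# Toy knowledge base: tokenized snippets plus hypothetical embedding vectors.
docs = [["reentrancy", "call", "value"],
        ["integer", "overflow", "add"],
        ["timestamp", "dependence", "block"]]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.2, 0.9]]
ranking = hybrid_rank(["reentrancy", "call"], docs, [1.0, 0.0, 0.0], doc_vecs)
```

In a real system the vectors would come from an embedding model, and the fusion rule (weighted sum, reciprocal rank fusion, or a learned gate) is a design choice the abstract does not specify.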

Abstract:
In a recent article, Bai et al. (2022) investigated the impact of outdated channel state information (CSI) on covert communication in wireless greedy relay networks. In this commentary, we identify a critical mathematical flaw in their derivation of the cumulative distribution function (CDF) of the outdated channel gain. The flaw fundamentally undermines the paper’s main results and conclusions regarding the impact of outdated CSI in such networks. To prevent future studies from building upon this incorrect foundation, we also provide a correction for the flaw and suggestions for future investigations into this problem. Beyond correcting a specific error, this commentary serves as a warning on the potential pitfalls in probabilistic modeling under outdated CSI.

Abstract:
Pretrained Language Models (PLMs) and Graph Neural Networks (GNNs) have emerged as promising approaches for software vulnerability detection. However, existing methods still face limitations, including the absence of fine-grained cross-modal interaction and the impact of data noise. Approaches integrating PLMs and GNNs fail to fully leverage their complementary strengths, while unreliable labels hinder generalization, further degrading real-world detection performance. To overcome these limitations, we propose Vul-CTG, a multimodal integration framework for software vulnerability detection that combines Code Text and program Graph representations. Vul-CTG constructs enriched code graph representations by integrating statement-level source code graphs and abstract code property graphs, enabling more effective alignment between structural and semantic information. To enhance robustness against noisy labels and improve cross-modal consistency, the model incorporates contrastive learning and pre-training techniques. Central to Vul-CTG is CTG-Former, a novel alignment architecture that projects both code text and graph modalities into a unified latent space, allowing the model to capture complex structural and semantic patterns for more accurate vulnerability detection. Experimental results on recent function-level datasets demonstrate the effectiveness of Vul-CTG, showing an approximate 3% improvement in F1-score over state-of-the-art methods. Our code is available at https://github.com/ryxFry/Vul-CTG

Abstract:
Determining whether an input utterance includes fake or bona fide segments without additional segment annotation is a frequently investigated topic in Partially Spoofed Speech Detection (PSSD). Nevertheless, most existing works on this topic fail to sufficiently consider inter-segment temporal dependence within an utterance, and they usually overlook representative possibly-fake segments, which may be critical for detecting spoofed utterances. In response, we propose a Fake segment Mining based Graph neural network (FMG) to address these issues. For the first issue, we employ a Graph Neural Network (GNN) based module to process segment-level representations; it comprises Adjacent Temporal Dependence (ATD) and Across Temporal Correlation (ATC) branches that jointly model the dependence between adjacent segments and the correlation across different segments. For the second issue, we propose a Fake Segment Mining (FSM) module, which contains attentive pooling, a fake segment prototype loss, and an entropy loss, in order to achieve utterance-level predictions and pinpoint representative fake segments. Experimental evaluations on spoofed-speech datasets demonstrate that the proposed approach outperforms the compared models, showing its effectiveness for detecting partially spoofed utterances without segment annotation.

Abstract:
Long-range attacks pose a significant threat to the integrity of Proof-of-Stake (PoS) blockchains by enabling adversaries to reconstruct an alternative chain history embedded with fraudulent transactions. These attacks can deceive honest participants into accepting a maliciously crafted branch as the canonical chain. While Key Evolving Signature (KES) schemes are widely adopted to mitigate such threats, they typically rely on the assumption that validators behave honestly. In this work, we challenge this assumption by demonstrating how a malicious validator can exploit inherent limitations in existing KES-based mechanisms to mount a successful long-range attack. To address this critical vulnerability, we introduce a novel cryptographic construction that combines one-time signatures with commitment schemes. Our approach imposes constraints on the signing capabilities of validators, thereby significantly reducing the feasibility of long-range attacks. We provide rigorous formal security proofs to substantiate the robustness of our scheme and conduct a comprehensive performance evaluation. The results show that our solution is both computationally and storage efficient, making it a practical and scalable defense mechanism for real-world PoS blockchain deployments.
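To make the two building blocks named in this abstract concrete, the following is a minimal sketch of a textbook Lamport one-time signature together with a simple hash-based commitment to the public key. It is not the authors' construction: how the paper combines the two primitives is not stated here, and the helper names (`keygen`, `sign`, `verify`, `commit`, `open_commitment`) and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib
import secrets

def H(data: bytes) -> bytes:
    """SHA-256, used both for the signature scheme and the commitment."""
    return hashlib.sha256(data).digest()

def keygen():
    """Lamport key pair: 256 pairs of random secrets and their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk

def message_bits(msg: bytes):
    """The 256 bits of SHA-256(msg), most significant bit first."""
    digest = H(msg)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, msg: bytes):
    """Reveal one secret per digest bit; the key must never sign twice."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(msg))]

def verify(pk, msg: bytes, sig) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    return all(H(s) == pk[i][bit]
               for i, (s, bit) in enumerate(zip(sig, message_bits(msg))))

def commit(pk, nonce: bytes) -> bytes:
    """Hash commitment to a public key, blinded by a random nonce."""
    flat = b"".join(a + b for a, b in pk)
    return H(nonce + flat)

def open_commitment(commitment: bytes, pk, nonce: bytes) -> bool:
    """Verify that (pk, nonce) opens the given commitment."""
    return commit(pk, nonce) == commitment

sk, pk = keygen()
nonce = secrets.token_bytes(32)
c = commit(pk, nonce)               # published ahead of time
sig = sign(sk, b"block 42")         # the single message this key may sign
ok = verify(pk, b"block 42", sig) and open_commitment(c, pk, nonce)
```

Because each key pair can safely sign only one message, committing to keys in advance is one plausible way to constrain what a signer can later produce; the exact mechanism in the paper may differ.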

Abstract:
Integrated Vehicular Networks (IVNs) require the exchange of large volumes of safety-critical broadcast messages, where ensuring message origin authenticity and data integrity can be more critical than message confidentiality. Many existing authenticated broadcast mechanisms either rely on classical public-key cryptographic techniques that are vulnerable to quantum adversaries or suffer from significant verification and revocation overhead under dense traffic conditions. This paper proposes QS-BTrust, a quantum-secure and privacy-preserving authenticated broadcast protocol that combines Physical Unclonable Functions (PUFs), post-quantum digital signatures, and Hashgraph-based authentication tag management. QS-BTrust enables immediate message authentication with constant-time verification complexity, supports pseudo-indistinguishable identities to achieve unlinkability, and enables revocation without introducing additional message overhead, even in dense traffic scenarios. The security of QS-BTrust is analyzed under the Dolev–Yao (DY) adversary model to establish protocol-level robustness against active network attacks and is formally validated using ProVerif to verify broadcast authenticity and resistance to message forgery. The protocol is further analyzed under both the Random Oracle Model (ROM) and the Quantum Random Oracle Model (QROM), demonstrating robustness against replay, impersonation, message forgery, Sybil, and quantum-assisted attacks. Security guarantees are evaluated using adversarial success probability, unforgeability, and impersonation advantage as attack metrics. Simulation-based performance evaluation further shows that QS-BTrust achieves lower verification overhead under high message loads compared to representative broadcast authentication schemes, while providing quantum resistance. These characteristics make QS-BTrust well suited for large-scale and critical vehicular communication environments.

Abstract:
In an open and complex environment of drone service collaboration scenarios, efficient identity authentication is essential for secure service interactions. Traditional schemes suffer from high computational costs, low communication efficiency, and vulnerability to single-point failures. This paper presents a new lightweight authentication scheme (BDP-Auth) based on blockchain decentralized identity (DID) and physically unclonable functions (PUFs). PUF enables low-power, low-cost identity key generation without relying on complex computation or large storage, ensuring tamper-proof identifiers while reducing hardware and energy overhead. Using blockchain decentralization, the DID mechanism securely registers and stores drone identities, offering transparency and integrity. Security analysis shows that BDP-Auth achieves forward and backward security, meaning that the compromise of the current session key does not reveal past or subsequent session communications, and also resists physical capture and impersonation attacks. Experiments demonstrate that, compared to PRLAP-IoD, BDP-Auth reduces the computation overhead by 38.3% and 45.0% on the STM32F4 and NodeMCU platforms, respectively, decreases communication overhead by 36.9%, and reduces storage overhead by 27.5%, significantly improving overall efficiency.

Abstract:
Traditional physical layer authentication (PLA) schemes in wireless networks typically depend on individual nodes for feature observation, lacking cooperative gain and thus suffering from limited accuracy and robustness. This paper investigates a cooperative transmission and PLA framework for wireless networks, wherein multiple legitimate devices serve as both message relays and identity verifiers by extracting hardware fingerprints. We first establish a theoretical model for the system’s bit error rate (BER), false alarm rate (FAR), and detection probability (PD), and derive closed-form upper bounds to characterize the transmission reliability and authentication performance. Based on our theoretical model, we then define an accuracy improvement ratio (AIR) metric that quantifies FAR gain without compromising BER and PD performance, and derive channel conditions to ensure a positive AIR. To validate the proposed integrated framework, a time-division multiple access (TDMA)-based case study is conducted using carrier frequency offset as the authentication feature. Simulation results demonstrate that the proposed scheme can reduce FAR by up to 100.0% under favorable signal-to-noise ratio (SNR) conditions, while maintaining the BER and PD performance of the conventional non-cooperative scheme, thereby confirming its effectiveness for enhancing secure wireless communications.

Abstract:
Multimodal biometric systems face critical deployment challenges due to frequent partial modality missing and unreliable modality quality (e.g., blurred fingerprints). Existing methods, relying on unidirectional feature interaction or static fusion, suffer catastrophic performance degradation under such conditions. This paper proposes an adaptive hybrid fusion network (AHFNet) for multimodal hand recognition, which dynamically selects and reweights modalities to correct errors induced by missing or low-quality inputs. The proposed framework comprises three core modules. Initially, a wavelet feature extraction module is implemented to capture high-frequency biometric feature representations across modalities. Subsequently, a dual-path dynamic fusion module is designed using symmetric cross-attention for bidirectional feature reconstruction, preserving modality specificity via residual projections. Furthermore, this module employs the attention mechanism to perform reweighting, enabling the construction of a dynamic inter-modal compensation pathway at the feature level. Finally, a category-aware dynamic decision fusion module dynamically generates category-level weight coefficients based on input samples, enabling sample-adaptive optimization of multimodal contributions. Experiments on five public hand biometric databases demonstrate AHFNet’s superiority over existing methods across diverse multimodal scenarios, validating its broad applicability. Crucially, AHFNet emphasizes modal reliability differences. Even when certain modalities are missing, it exhibits less than 5% accuracy degradation, a relative improvement of over 75% compared to existing approaches, which typically suffer from more than 20% degradation. This enhanced robustness is consistently observed across all five databases.

Abstract:
Large language models (LLMs) are currently at the forefront of the machine learning field, showing broad application prospects but at the same time presenting certain risks of privacy leakage. Both the training datasets and the user’s input data during interactions face security issues, which need to be addressed urgently before their further development. To address this problem, we combine privacy-preserving techniques such as fully homomorphic encryption (FHE) and provable security theory with parameter-efficient fine-tuning (PEFT) to propose an efficient and secure inference scheme for LLMs that protects both the user-side’s input and the server-side’s private parameters. More specifically, we focus on pretrained LLMs that rely on open-sourced base models and then are fine-tuned with private datasets by LoRA. This is a popular roadmap for vertical domain large models such as LawGPT and BenTsao. To achieve this efficient and secure inference LLM scheme, we use two key technologies that are summarized below: 1) we divide the whole model into two parts, denoted as the public part and the private part. The weights of the public part are publicly accessible (e.g., the open-sourced base model), whereas those of the private part need to be protected (e.g., the LoRA matrices). Then, the public part is deployed on the client side, and the server maintains the private part. In this way, the overhead associated with computing on private data can be greatly reduced. 2) we propose a general method to transform a linear layer into another one that provides security against model extraction attacks and preserves its original functionality, denoted as the private linear layer (PLL). Afterwards, we use this method on the LoRA matrices of the server side, where the PLL changes the computation of the LoRA matrices in a way that accomplishes correct inference and ensures that the server protects their private weights without restricting the user’s input. 
We also show that the difficulty of performing model extraction attacks for the PLL can be reduced to the well-known hard problem of learning with errors (LWE). Combining this method with FHE, we can obtain an inference algorithm for fine-tuned LLMs that protects the user’s input and the server’s private weights at the same time. In this paper, we use the open-source model ChatGLM2-6B as the base model, which is fine-tuned by LoRA. The experimental results show that the inference efficiency of our scheme reaches 1.61 s/token, demonstrating that the scheme is highly practical.
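The public/private split described in this abstract rests on the standard LoRA decomposition, in which a fine-tuned linear layer computes W x + B (A x). The sketch below shows only that plaintext arithmetic: the client evaluates the public base weight W, the server evaluates the private low-rank matrices A and B, and the two partial results are summed. The function names and toy matrices are assumptions for illustration; the FHE encryption of the input and the private linear layer (PLL) transformation are deliberately omitted.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]

def client_public_part(W, x):
    """Client side: apply the open-source base weight W (public)."""
    return matvec(W, x)

def server_private_part(A, B, x, scale=1):
    """Server side: apply the private LoRA update, scale * B (A x)."""
    return [scale * v for v in matvec(B, matvec(A, x))]

def combine(public_out, private_out):
    """Sum the two partial outputs to obtain the fine-tuned layer output."""
    return [p + q for p, q in zip(public_out, private_out)]

# Toy 2x2 layer with a rank-1 LoRA update (all values hypothetical).
W = [[1, 0], [0, 1]]      # public base weight
A = [[1, 1]]              # private, shape 1 x 2
B = [[2], [3]]            # private, shape 2 x 1
x = [1, 2]

y = combine(client_public_part(W, x), server_private_part(A, B, x))

# Sanity check against the merged weight W + B A.
merged = [[W[i][j] + sum(B[i][k] * A[k][j] for k in range(len(A)))
           for j in range(len(x))] for i in range(len(W))]
y_merged = matvec(merged, x)
```

Only the low-rank branch touches private weights, which is why restricting the costly protected computation to A and B can reduce overhead relative to protecting the whole model.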

Abstract:
The gaming industry has experienced substantial growth, but cheating in online games poses a significant threat to the integrity of the gaming experience. Cheating, particularly in first-person shooter (FPS) games, can lead to substantial losses for the game industry. Existing anti-cheat solutions have limitations: client-side approaches face hardware constraints and security risks, server-side approaches rely on unreliable methods, and both suffer from a lack of comprehensive real-world datasets. To address these limitations, this paper proposes Hawk, a server-side FPS anti-cheat framework for the popular game CS:GO. Hawk utilizes machine learning techniques to mimic human experts’ identification process, leverages novel multi-view features, and is equipped with a well-defined workflow. Hawk is evaluated on the first large real-world datasets containing multiple cheat types and levels of cheating sophistication. It exhibits promising efficiency and acceptable overhead, achieves shorter ban times, higher recall, and a similar false positive rate compared to the in-use anti-cheat, and can capture cheaters who evaded official inspections.

Abstract:
Existing CNN inference frameworks based on FHE often suffer from reduced efficiency and accuracy due to the polynomial approximation of activation functions, and they lack effective mechanisms to prevent sensitive information leakage during the final classification stage. To address these limitations, we propose FSAT, a fast and secure inference framework enhanced with adversarial training. Specifically, FSAT employs a private CNN model architecture, where linear layers are computed through an optimized homomorphic ciphertext convolution operation, while non-linear layer operations are efficiently realized using a secure searchable index and an encrypted look-up table, which replace polynomial activation approximations and significantly improve inference accuracy and latency performance. To further mitigate information leakage, we introduce a dual-constraint adversarial training scheme that makes it substantially more difficult for an adversary to infer sensitive attributes of the input data. Experimental results demonstrate that FSAT achieves high inference accuracy and efficiency while substantially reducing the risk of sensitive data leakage.

Abstract:
Vertical Split Federated Learning (VSFL) allows participants to collaboratively train a better model with different features vertically partitioned in the same sample space, where the model is divided by the cut layer into a bottom model and a top model, trained by passive and active participants respectively. However, in the process, the labels owned by the active participant can still be inferred or stolen by curious or malicious passive participants. In this paper, we propose Casper, a causality-inspired defense mechanism with a confounder against label inference attacks in VSFL. Casper first analyzes the feasibility of optimizing the training process in VSFL at the intervention level from a causal perspective. It then introduces a confounder consisting of cut layer output reconstruction and label obfuscation to disrupt the direct causality between cut layer outputs and labels. Additionally, we integrate selective discrepancy training to further ensure model utility by strategically balancing training between active and passive participants. Extensive experiments conducted on four datasets across different tasks demonstrate that Casper effectively preserves label privacy while maintaining model performance, significantly outperforming existing advanced defense methods in VSFL.

Affiliations: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; School of Artificial Intelligence, Optics and Electronics (iOPEN) and the School of Mechanical Engineering, Northwestern Polytechnical University, Xi’an, Shaanxi, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR) and the School of Software, Shandong University, Jinan, China

Abstract:
Provenance graph-based anomaly detection, particularly for Advanced Persistent Threat (APT) detection, addresses the issues of large-scale graphs and data imbalance. However, existing methods struggle with information loss, high computational complexity, and low detection accuracy. To address the above challenges, this paper proposes TraceCluster, a lightweight and adaptive clustering-based Subgraph Attention Network (SAN) for APT detection in provenance graphs. TraceCluster mitigates the neighborhood explosion problem by clustering nodes to partition large-scale graphs, thus reducing reliance on the global graph while preserving local neighborhood information. Furthermore, the method dynamically models complex inter-node dependencies within subgraphs. It employs an attention mechanism to adaptively highlight the most relevant connections. This enhances node representations and improves overall feature extraction. This design substantially reduces memory consumption and avoids the high computational complexity of global graph processing. In addition, an adaptive category-weighting loss function assigns variable weights to different classes, improving the detection of rare and anomalous behaviors. Experimental results show that on the OpTC dataset, the currently faster method is 37-fold and 3-fold slower than our approach in terms of inference time respectively. Furthermore, in the nine real-world scenarios of four evaluated datasets, TraceCluster outperforms state-of-the-art (SOTA) approaches in terms of overall performance, especially in node-level APT detection tasks.

Abstract:
Recent AI agents, such as ChatGPT and LLaMA, primarily rely on instruction tuning and reinforcement learning to calibrate the output of large language models (LLMs) with human intentions, ensuring the outputs are harmless and helpful. Existing methods heavily depend on the manual annotation of high-quality positive samples, while contending with issues such as noisy labels and minimal distinctions between preferred and dispreferred response data. However, readily available toxic samples with clear safety distinctions are often filtered out, removing valuable negative references that could aid LLMs in safety alignment. In response, we propose Positive–Toxic Self-Alignment (PT-ALIGN), a novel safety self-alignment approach that minimizes human supervision by automatically refining positive and toxic samples and performing fine-grained dual instruction tuning. Positive samples are harmless responses, while toxic samples deliberately contain extremely harmful content, serving as a new supervisory signal. Specifically, we utilize the LLM itself to iteratively generate and refine training instances using fewer than 50 human annotations. We then employ two losses, i.e., maximum likelihood estimation (MLE) and fine-grained unlikelihood training (UT), to jointly enhance the LLM’s safety. The MLE loss encourages an LLM to maximize the generation of harmless content based on positive samples. Conversely, the fine-grained UT loss guides the LLM to minimize the output of harmful words based on toxic samples at the token level, thereby guiding the model to decouple safety from effectiveness, directing it toward safer fine-tuning objectives, and increasing the likelihood of generating helpful and reliable content. Experiments on 9 popular open-source LLMs demonstrate the effectiveness of our PT-ALIGN for safety alignment, while maintaining comparable levels of helpfulness.
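
As a rough illustration of how an MLE term and a token-level unlikelihood term can be combined, the sketch below uses the standard unlikelihood form, -log(1 - p(token)), applied only at positions flagged as harmful. The function names, the `harmful_mask` input, and the weight `lam` are hypothetical and chosen for illustration; the abstract does not specify the exact PT-ALIGN losses.

```python
import math

EPS = 1e-12  # keeps log() finite when a probability reaches 0 or 1


def mle_loss(probs, targets):
    """Mean negative log-likelihood of the target tokens (positive samples).

    probs[i] is the model's distribution over the vocabulary at step i,
    targets[i] is the index of the desired token at step i.
    """
    return -sum(math.log(max(p[t], EPS)) for p, t in zip(probs, targets)) / len(targets)


def unlikelihood_loss(probs, toxic_tokens, harmful_mask):
    """Mean of -log(1 - p(token)) over positions flagged as harmful.

    harmful_mask[i] is 1 where the toxic token should be suppressed, else 0
    (an assumed way of making the penalty token-level).
    """
    flagged = sum(harmful_mask)
    if flagged == 0:
        return 0.0
    total = sum(
        m * -math.log(max(1.0 - p[t], EPS))
        for p, t, m in zip(probs, toxic_tokens, harmful_mask)
    )
    return total / flagged


def combined_loss(pos_probs, pos_targets, tox_probs, tox_tokens, harmful_mask, lam=1.0):
    """Weighted sum of the two terms; lam is an assumed trade-off weight."""
    return mle_loss(pos_probs, pos_targets) + lam * unlikelihood_loss(
        tox_probs, tox_tokens, harmful_mask
    )
```

Minimizing the first term raises the probability of tokens in harmless responses, while minimizing the second lowers the probability of the flagged tokens in toxic responses.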

Abstract:
With the widespread application of online matching in online social networks, bilateral access control has become an increasingly promising access control paradigm as it enables on-demand data sharing while preserving data privacy. Nonetheless, existing solutions face security issues, such as achieving only a weaker (i.e., selective) security or disclosing user privacy to a third party that assists with matching. In this paper, we introduce a novel solution referred to as Self-constrained Attribute-based Matchmaking Encryption (SAME) that offers fine-grained bilateral access control with self-constrained matching. Our solution allows both participants (i.e., sender and receiver) to specify fine-grained access control policies that the other party must satisfy in order for the data to be shared. Meanwhile, our solution offers self-constrained matching such that the receiver can autonomously control his decryption privilege at each decryption to ensure that only the ciphertexts matching his specified policy can be decrypted. Owing to the self-constrained matching, the receiver does not need any help from a third party when decrypting, and the receiver’s decryption key does not change with each specified policy. We present a concrete SAME scheme in asymmetric bilinear groups and prove the full security of the scheme under standard assumptions. The comprehensive performance comparison and analysis show that the proposed scheme achieves millisecond-level key generation and encryption, while reducing decryption overhead by approximately 20% compared to the most related solutions.

Abstract:
In the realm of deep learning, the veracity and integrity of the training data are pivotal for constructing reliable and transparent models. This study introduces the concept of Trustworthy Dataset Proof (TDP), which tackles the significant challenge of verifying the authenticity of training data as declared by trainers. Existing dataset provenance methods, which primarily aim at ownership verification rather than trust enhancement, often face challenges with usability and integrity. For instance, excessive operational demands and the inability to effectively verify dataset authenticity hinder their practical application. To address these shortcomings, we propose a novel technique termed Data Probe, which diverges from traditional watermarking by utilizing subtle variations in model output distributions to confirm the presence of a specific and small subset of training data. This model-agnostic approach improves usability by minimizing the intervention during the training process and ensures dataset integrity via a mechanism that only permits probe detection when the entire claimed dataset is utilized in training. Our study conducts extensive evaluations to demonstrate the effectiveness of the proposed data-probe-based TDP framework, marking a significant step toward achieving transparency and trustworthiness in the use of training data in deep learning.

Abstract:
With the growing demand for outsourcing data to decentralized storage systems, ensuring the integrity of outsourced data becomes a critical challenge. Existing auditing schemes, however, often assume single-copy or centralized models, and suffer from inefficiency, lack of public verifiability, or poor scalability in multi-replica settings. To address these limitations, we propose MPA, a lightweight and publicly verifiable auditing scheme tailored for multi-copy cloud storage. By integrating polynomial commitment schemes with Merkle trees, our design achieves efficient block-level integrity verification while enabling dynamic updates. To mitigate collusion between cloud service providers, each data copy is uniquely encrypted, and the audit process supports simultaneous verification across multiple providers. Furthermore, we introduce an optimized batch auditing mechanism that allows the verifier to aggregate proofs across different files and providers, reducing both computation and communication overhead. To enhance audit transparency and unpredictability, we adopt a blockchain-assisted challenge generation protocol based on commit-and-reveal randomness. Theoretical analysis and performance evaluation demonstrate that MPA achieves strong security guarantees under standard assumptions, while significantly outperforming existing solutions in terms of efficiency and scalability.
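
To make the commit-and-reveal idea concrete, the following is a minimal sketch of how several parties could jointly derive unpredictable audit challenges. It uses plain SHA-256 hash commitments; the function names, the 32-byte nonce convention, and the index derivation are assumptions for illustration and are not taken from the MPA protocol, which additionally runs this on a blockchain.

```python
import hashlib
import secrets

NONCE_LEN = 32  # fixed nonce length keeps nonce || value unambiguous


def commit(value: bytes, nonce: bytes) -> bytes:
    """Commit phase: publish SHA-256(nonce || value), keep value and nonce secret."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be exactly NONCE_LEN bytes")
    return hashlib.sha256(nonce + value).digest()


def verify(commitment: bytes, value: bytes, nonce: bytes) -> bool:
    """Reveal phase: check that the opened value matches the earlier commitment."""
    return secrets.compare_digest(commitment, commit(value, nonce))


def challenge_seed(revealed_values: list) -> bytes:
    """Combine all verified reveals (in a fixed order) into one seed."""
    h = hashlib.sha256()
    for value in revealed_values:
        # Hash each value first so every contribution has a fixed length.
        h.update(hashlib.sha256(value).digest())
    return h.digest()


def challenge_indices(seed: bytes, num_blocks: int, count: int) -> list:
    """Derive `count` block indices in [0, num_blocks) from the seed (may repeat)."""
    indices = []
    for i in range(count):
        digest = hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
        indices.append(int.from_bytes(digest, "big") % num_blocks)
    return indices
```

Because each party is bound to its value before seeing the others, no single party can steer the resulting seed, provided at least one contribution is random and every commitment is opened.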

Abstract:
Different from traditional object detectors such as YOLO, Detection Transformers (DETR) have reshaped the landscape of object detection by replacing heuristic-driven components like Non-Maximum Suppression with a fully end-to-end framework based on one-to-one Hungarian matching. While the majority of research has focused on improving the slow training convergence of DETR, this work investigates their security from an adversarial perspective. We unveil a critical vulnerability stemming directly from DETR’s core design: the deterministic one-to-one mapping between object queries and ground-truth objects can be exploited. This allows an adversary to craft perturbations that selectively manipulate specific target objects, causing them to vanish or be misclassified, while still preserving the detection integrity of all other objects in the scene. Our initial analysis reveals that conventional gradient-based attacks are ill-suited for this task, as they induce unintended interference on non-target instances, a phenomenon we term the “spillover effect”. To overcome this, we reformulate the attack optimization by incorporating a novel penalty term that explicitly decouples the adversarial influence on target and non-target objects. Furthermore, we provide a theoretical analysis to derive perturbation bounds under which the optimal matching assignments remain invariant, offering deeper insights into the model’s stability. Extensive experiments on standard benchmarks demonstrate that our proposed attack significantly improves the success rate and convergence speed while inducing far fewer feature-level artifacts, making the attack both more effective and stealthier. The source code is available at: https://github.com/Hill-Wu-1998/coa

Abstract:
Open-set gait recognition presents a critical challenge for real-world identity authentication systems, requiring simultaneous identification of known users and detection of unknown users under practical deployment conditions. However, in practical Wi-Fi sensing environments, signal noise, clothing variation, and multipath interference often blur the boundary between known and unknown classes, making traditional closed-set methods inadequate. To address this challenge, GUARD is proposed as a unified open-set and closed-set gait recognition framework based on feature reconstruction. The core idea is that known-class samples can be accurately reconstructed under matched label conditions, while unknown samples yield significantly higher reconstruction errors due to label mismatch, thereby providing a discriminative signal for open-set recognition. To enhance the stability and discriminability of features, GUARD integrates a Global Temporal Attention (GTA) mechanism to capture long-range temporal dependencies, and introduces a Pseudo-Gaussian Enhanced Self-Attention (PGESA) module that models dynamic attention distributions via Gaussian approximation, enabling selective emphasis on salient temporal features while effectively suppressing background noise. Additionally, a feature extractor locking strategy is employed to freeze identity-relevant representations once closed-set performance is optimized, preventing degradation during open-set training. Experimental results show that GUARD achieves over 20% improvement in open-set recognition rate, while maintaining approximately 95% closed-set accuracy, demonstrating superior robustness and generalization in complex sensing environments.

Abstract:
Chaotic antenna arrays (CAAs) have been shown to produce enhanced and robust RF fingerprints, enabling more reliable device authentication using machine learning (ML) compared to traditional methods based on subtle hardware imperfections. In this paper, we present a novel CAA-specific multipath channel model to accurately capture the CAA’s phase errors across all propagation paths providing a basis for processing schemes that remove the channel effect, enabling extraction of the RF signature of the CAA. In addition, we provide a comprehensive experimental validation of CAA-based authentication under practical wireless channel conditions and include full details covering the design, fabrication, and characterization of custom CAA nodes, along with their integration within a software-defined radio (SDR) testbed for over-the-air measurements. This manuscript shows, for the first time, that training of the ML-based authenticator on the CAA RF fingerprints can be conducted under line-of-sight (LOS) conditions and effectively generalized to diverse scenarios, including non-line-of-sight (NLOS) environments, as long as a dominant propagation path exists. High authentication accuracy is consistently achieved when this key spatial channel characteristic is preserved. Accuracy exceeds 90% in LOS and reaches up to 87% in NLOS conditions with a single dominant reflected path. This study provides the first practical validation of spatially varying CAA fingerprints, underscoring their promise for secure and robust physical layer authentication across varied wireless conditions.

Abstract:
Recent studies show that deep neural networks, especially image classification models, are highly vulnerable to adversarial examples. However, existing defenses suffer from limited adaptability across attacks, an unfavorable trade-off between clean accuracy and robustness, and substantial training-time overhead. To tackle these problems, we present a novel component, named the redundant fully connected layer, which can be combined with existing model backbones in a pluggable manner. Specifically, we design a tailor-made loss function for it that leverages cosine similarity to maximize the difference and diversity of multiple fully connected parts. We conduct extensive experiments against 12 representative attacks (white-box and black-box) on two popular datasets. The empirical evaluations show that our scheme achieves significant robustness against various attacks with negligible additional training overhead, while hardly degrading clean-sample accuracy.
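
One simple way to encourage diversity among several fully connected parts with cosine similarity is to penalize the mean pairwise similarity of their weight vectors. The sketch below shows that idea only; the function names and the choice of a mean pairwise penalty are assumptions, and the paper's tailor-made loss may be defined differently.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def diversity_penalty(heads):
    """Mean pairwise cosine similarity across the flattened weights of each part.

    Adding this term to the task loss and minimizing it pushes the parts
    toward different directions in weight space.
    """
    pairs = list(combinations(heads, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
```

Orthogonal parts give a penalty of 0 and parallel parts give 1, so a lower value corresponds to more diverse parts.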

Abstract:
With the evolution of cloud storage, data outsourcing has become a common trend. While cloud storage provides convenient services, its privacy and security issues have also attracted widespread attention. Dynamic searchable symmetric encryption (DSSE) provides an effective means of addressing data privacy issues by allowing data owners to directly perform retrieval or update operations on data while it is stored in the cloud. However, traditional DSSE cannot meet the needs of multi-user scenarios in practical applications. In addition, DSSE should provide forward and backward privacy to resist file-injection attacks and should support validation of search results. Therefore, we present verifiable multi-user dynamic searchable symmetric encryption with forward and backward privacy feasible for cloud storage (VM-DSSE-FB). The scheme constructs a DSSE framework that satisfies both forward and backward privacy by fusing symmetric encryption with homomorphic addition and bitmap indexing techniques. Meanwhile, we set up an encrypted update queue for each user to adapt to the requirements of multi-user scenarios. In addition, with the help of bitmap indexing and logical AND operations, we realize an accurate multi-keyword query function. In particular, the cloud server returns the search results with a workload proof to enable effective verification of the integrity and correctness of the search results. Security analysis proves that the VM-DSSE-FB approach achieves adaptive security. The performance analysis shows that, compared with existing similar schemes, VM-DSSE-FB achieves more functionality while improving efficiency and is more suitable for cloud storage services.
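
The role of bitmap indexing and logical AND in multi-keyword queries can be sketched on plaintext data as follows. This sketch deliberately omits the encryption layer, the per-user update queues, and the verification proof described in the abstract; the function names and the toy document IDs are hypothetical.

```python
def build_bitmap_index(docs):
    """Map each keyword to an integer bitmap; bit i is set if document i contains it.

    docs maps a non-negative integer document ID to its set of keywords.
    """
    index = {}
    for doc_id, keywords in docs.items():
        for keyword in keywords:
            index[keyword] = index.get(keyword, 0) | (1 << doc_id)
    return index


def conjunctive_search(index, keywords):
    """Return the sorted IDs of documents that contain every queried keyword."""
    if not keywords:
        return []
    result = -1  # all bits set, the identity element for AND
    for keyword in keywords:
        result &= index.get(keyword, 0)  # unknown keywords match nothing
    return [i for i in range(result.bit_length()) if (result >> i) & 1]


docs = {
    0: {"cloud", "privacy"},
    1: {"cloud"},
    2: {"cloud", "privacy", "search"},
}
index = build_bitmap_index(docs)
matches = conjunctive_search(index, ["cloud", "privacy"])
```

Here `matches` holds the IDs of documents 0 and 2, the only ones containing both keywords. A single AND over fixed-width bitmaps yields the exact intersection, which is why the result is accurate rather than a superset.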

Abstract:
In the Internet of Medical Things (IoMT), transmitted medical data contain sensitive patient information. To protect patient identity privacy and address the adaptability limitations of traditional ring signatures, Xu et al. proposed the anonymous sequential multi-signature ring signature (ASMR) to enable secure medical data sharing in the IoMT. Although Xu et al. claim that the ASMR scheme is secure, our analysis demonstrates that it is insecure, as it is existentially and universally forgeable, allowing any entity to generate a valid ring signature for arbitrary messages. Following the presentation of the attack, we analyze the causes of these vulnerabilities and propose corresponding countermeasures.

Abstract:
Dynamic Searchable Symmetric Encryption (DSSE) allows clients to update data and search keywords securely over symmetrically encrypted data on an honest but curious server. Conjunctive DSSE, an attractive type of DSSE with expressive search, enables clients to find data containing multiple keywords simultaneously. However, recently proposed efficient conjunctive DSSE schemes, such as ODXT (in NDSS’21) and SDSSE-CQ (in PETS’25), all rely on cross-tag techniques and suffer from either forward privacy or volume-privacy leakages arising from conjunctive keywords, making them vulnerable to injection or leakage-abuse attacks. In this work, we analyze the aforementioned works in depth and design a new conjunctive DSSE scheme named FDXT. For any search query with multiple keywords, FDXT guarantees the forward privacy of all queried keywords. In contrast, ODXT only maintains the forward privacy of the single and first queried keyword. FDXT also avoids volume leakage compared with SDSSE-CQ. Finally, we compared FDXT with ODXT and SDSSE-CQ in terms of performance on the Crime, Wikipedia, and Enron datasets. The experimental results show that FDXT exhibits good performance, which is comparable to ODXT and significantly better than SDSSE-CQ.

Abstract:
The challenge of fraud detection, especially in credit card transactions, continues to grow as fraudsters adapt to increasingly sophisticated tactics. Traditional methods, including rule-based systems, are often limited by high false positives and poor adaptability to new fraud patterns. This study introduces a cutting-edge approach using an attention-based ensemble framework that synergizes the strengths of convolutional neural networks (CNNs), graph neural networks (GNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) models. A key innovation of this model is its confidence-driven gating mechanism, which dynamically combines the most reliable prediction, ensuring both accuracy and transparency. By incorporating dependent ordered weighted averaging (DOWA) and induced ordered weighted averaging (IOWA) operators, the model effectively integrates outputs from various classifiers, improving robustness. Additionally, SHAP-based feature selection reveals the most influential variables, providing deeper interpretability. Extensive evaluations across three diverse benchmark datasets demonstrate the superior performance of this framework, surpassing individual classifiers and existing ensemble models in both balanced and highly imbalanced settings. This research paves the way for more reliable and transparent fraud detection systems, capable of adapting to evolving fraud tactics in real-world financial environments.

Abstract:
Federated Learning (FL) is vulnerable to backdoor attacks by design since it cannot inspect clients’ local data to protect their privacy. This privacy-preserving feature creates an opportunity for malicious clients to introduce backdoors. However, existing backdoor attacks face two main limitations. First, brute amplification (i.e., uniformly scaling up malicious parameters) can be easily detected, hence compromising attack stealthiness. Second, evasion strategies employed to prevent their backdoors from being overwritten by benign updates are frequently ineffective, reducing the overall attack stability upon model deployment. To address these limitations, we propose an adaptive proactive boosting strategy to enhance both the stealthiness and durability of backdoor attacks in FL. As a concrete example, ReBA introduces a durable importance metric based on stability degrees of parameters as an update mask for malicious attackers, assigning higher weights to backdoor-related parameters during the update process. To ensure stealthiness, ReBA formulates an optimization problem regarding amplification factor by minimizing the distance between malicious and clean updates, thereby correcting malicious updates within a benign distance space. Extensive evaluations on 3 datasets and across 14 defenses demonstrate the efficacy of ReBA, outperforming over 12 baseline backdoor attacks.

Abstract:
Face recognition (FR) brings convenience to people’s lives while also posing security risks. Some malicious users employ FR attacks to impersonate the identity of a target. To reveal the security risks, recent work has attacked black-box FR models by utilizing substitute models to generate adversarial face images that are misclassified as the target individual due to the attack transferability of substitute models. However, the substitute models cannot accurately approximate the target model, which leads to a decrease in FR attack success rate and adversarial face image quality. To address this issue, we propose PPOM-Attack, a substitute model-free Perturbation Prediction and Optimization Method for black-box adversarial Attack against face recognition. PPOM-Attack directly obtains feedback from the target model instead of using substitute models, and thus avoids any discrepancy with the attack objective. To achieve this goal, we design a proximal policy optimization (PPO)-based agent to predict the perturbation regions in the face image and self-adaptively disturb the regions. To maintain high-quality adversarial face images, we further propose a minimum brightness offsets method specifically designed to generate perturbations that minimize the feature embedding difference between the adversarial and targeted face images. The experimental results show that our approach outperforms state-of-the-art FR attack methods by an average of 21.7% in terms of attack success rate, while achieving better image quality on seven FR models.

Abstract:
Privacy labels (e.g., Data Safety section on Google Play) aim to replace lengthy privacy policies with concise and standardized summaries of in-app privacy practices. However, studies reveal widespread inaccuracies in these self-reported labels, with developers omitting or misrepresenting privacy practices, undermining user trust and regulatory compliance. Existing methods for detecting such discrepancies lack coverage or scalability and fail to address the semantic ambiguity inherent in privacy label auditing. We present Iterative Context Reconstruction (ICR), an evidence-driven workflow that reconstructs context from decompiled code to resolve the ambiguity. Based on ICR, PriLabel is a context-aware static auditor that comprehensively uncovers omitted disclosures in Android privacy labels, mapping transmitted data to Google’s label taxonomy in a source-free and ontology-free manner. Our evaluation demonstrates PriLabel’s high precision (91.5%) in detecting omitted disclosures in privacy labels. Applied to 4,851 top-installed Google Play apps, it revealed that 2,374 apps omitted at least one disclosure, with 210 transmitting sensitive financial data (e.g., credit card numbers) without proper labeling, exposing systemic risks of non-compliance.

Abstract:
Multi-agent systems (MASs) in open networks face dual security threats: Byzantine attacks that steer malicious consensus and eavesdroppers that steal private information. Existing resilient consensus protocols isolate Byzantine attacks by relying on (2f + 1)-robust networks, which imposes stringent topological constraints and fails to provide privacy preservation simultaneously. To address this issue, an improved resilient consensus protocol with f (IRCP-f) is proposed, via absolute values of relative states, to defend against f-local Byzantine attacks. This protocol only requires that an undirected and connected graph is (f + 1)-robust instead of (2f + 1)-robust. The properties of the dynamic network processed by the IRCP-f are analyzed, and the graph conditions for achieving average consensus are consequently satisfied. A fully distributed differentially private event-triggered average consensus (DPETAC) control scheme is then developed. With the DPETAC control scheme, convergence analysis, Zeno behavior analysis, accuracy analysis and privacy analysis are presented for the MAS. Finally, a numerical simulation illustrates the feasibility and effectiveness of the proposed privacy-preserving average consensus control scheme.

Abstract:
The widespread adoption of Large Language Models (LLMs) in critical applications has introduced severe reliability and security risks, as LLMs remain vulnerable to notorious threats such as hallucinations, jailbreak attacks, and backdoor exploits. These vulnerabilities have been weaponized by malicious actors, leading to unauthorized access, widespread misinformation, and compromised LLM-embedded system integrity. In this work, we introduce a novel approach to detecting abnormal behaviors in LLMs via hidden state forensics. By systematically inspecting layer-specific activation patterns, we develop a general framework that can efficiently identify a range of security threats in real-time without imposing prohibitive computational costs. Extensive experiments indicate detection accuracies exceeding 95% and consistently robust performance across multiple models in most scenarios, while preserving the ability to detect novel attacks effectively. Furthermore, the computational overhead remains minimal, with detector inference taking merely fractions of a second. The significance of this work lies in proposing a promising strategy to reinforce the security of LLM-integrated systems, paving the way for safer and more reliable deployment in high-stakes domains. By enabling real-time detection that can also support the mitigation of abnormal behaviors, it represents a meaningful step toward ensuring the trustworthiness of AI systems amid rising security challenges.

Abstract:
Searchable encryption (SE) allows users to perform private queries on encrypted databases. Although SE schemes can protect data privacy, many pursue high performance at the cost of allowing certain leakage, such as search and access patterns. Exploiting such leakage together with auxiliary data similar to the encrypted database, an attacker can recover queries. State-of-the-art attacks (Nie et al., USENIX’24) on single-keyword queries achieve accuracies exceeding 90%. More recently, the community has focused on the more challenging attack of recovering multi-keyword queries, with the most advanced attacks (Liu et al., TIFS’25) achieving over 80% accuracy. Although these attacks can effectively recover queries, they all rely on a large number of similar documents, requiring the attacker to possess documents equivalent in volume to the database. This naturally raises a question: Can we achieve higher-accuracy attacks using a smaller amount of similar data? Requiring less information makes attacks easier to mount. Motivated by this, we present Mirage, an attack that recovers both single-keyword and multi-keyword queries while requiring only partial similar data. Our core idea is to first identify some special queries and design a series of novel algorithms to recover them, and then to partially reconstruct the database index and recover the remaining queries. Extensive experiments conducted across various real-world datasets demonstrate the effectiveness of our attack. The results show that when the attacker observes 51 time intervals and obtains only 0.5% of similar documents in each interval, Mirage achieves 91.6% and 95.4% accuracy on the Enron and Lucene datasets for single-keyword queries, respectively. For multi-keyword queries, Mirage achieves up to 90.7% and 93.8% recovery accuracy, respectively.

Affiliations: School of Computer Science and Artificial Intelligence, Shandong Normal University, Jinan, China; Information Security Center, State Key Laboratory of Networking and Switching Technology, and the National Engineering Laboratory for Disaster Backup and Recovery, Beijing University of Posts and Telecommunications, Beijing, China; Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China

Abstract:
With the advent of intelligent technologies, miscellaneous data containing sensitive information are explosively generated and shared. Compressive sensing methods are naturally suitable for such scenarios due to their joint compression and encryption capabilities. However, data users of most existing compressive sensing methods need to reconstruct original images before use, which brings two disadvantages. First, indiscriminately requiring every data user to reconstruct images without considering their exact requirements is neither advisable nor efficient. Second, allowing all data users to reconstruct original images may cause private or confidential information exposure. To address these issues, in this paper, a novel image sharing method is proposed, which realizes efficient multilevel privacy preservation. Specifically, data owners compress the original images with designed measurement matrices through the proposed Tℓ₁-B2DLDA algorithm, which outputs dimension-reduced data with the ability to simultaneously support the subsequent classification tasks for level I data users and reconstruction tasks for level II data users. Therefore, lower-level data users can achieve their goals without obtaining any private or confidential information in the original images. Experiments are conducted to verify the feasibility, performance and robustness of the proposed method. Furthermore, the security of the proposed method is analyzed both theoretically and practically. The source code of the proposed method is publicly available at https://github.com/xchuxiao23/TL1-B2DLDA

Abstract:
In this study, we apply the information-theoretic Privacy Funnel (PF) model to face recognition and develop a method for privacy-preserving representation learning within an end-to-end trainable framework. Our approach addresses the trade-off between utility and obfuscation of sensitive information under logarithmic loss. We study the integration of information-theoretic privacy principles with representation learning, with a particular focus on face recognition systems. We also highlight the compatibility of the proposed framework with modern face recognition networks such as AdaFace and ArcFace. In addition, we introduce the Generative Privacy Funnel (GenPF) model, which extends the traditional discriminative PF formulation, referred to here as the Discriminative Privacy Funnel (DisPF). The proposed GenPF model extends the privacy-funnel framework to generative formulations under information-theoretic and estimation-theoretic criteria. Complementing these developments, we present the deep variational PF (DVPF) model, which yields a tractable variational bound for measuring information leakage and enables optimization in deep representation-learning settings. The DVPF framework, associated with both the DisPF and GenPF models, also clarifies connections with generative models such as variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models. Finally, we validate the framework on modern face recognition systems and show that it provides a controllable privacy–utility trade-off while substantially reducing leakage about sensitive attributes. To support reproducibility, we also release a PyTorch implementation of the proposed framework.

Abstract:
As the Internet of Things (IoT) and Sixth Generation (6G) technologies advance rapidly, the cryptographic identification of electronic devices has become a critical issue in information security. Radio frequency fingerprint (RFF)-based specific emitter identification (SEI) has emerged as a prominent physical-layer authentication technique. To enhance the stability and accuracy of multi-target recognition in complex electromagnetic environments, a novel technique for specific emitter identification based on time-frequency similarity contrastive learning (TFSCL) is proposed. In this study, we present a novel pre-training method utilizing a deep complex-valued pyramid network (DCPN) to enhance the extraction and reconstruction of time series and frequency domain sequences. The DCPN enables contrastive learning of signal features in both the temporal and frequency domains, significantly reducing computational complexity and improving pre-training performance. Additionally, we introduce, for the first time, the Time-Frequency Synchronization Data Added (TFS-DA), a Time-Frequency Hybrid Data Added Technique that employs Gray code to generate random sequences, effectively improving feature representation in both domains. Empirical results demonstrate that the proposed method achieves an accuracy of 97.12% on an automatic dependent surveillance-broadcast (ADS-B) dataset that contains 10 categories with only 10% of the data labeled. On a LoRa dataset containing 30 categories with only 10% of the data labeled, the accuracy reaches 77.06%.